Abstract


Because of their capabilities for adaptation, nonlinear function approximation, and
parallel hardware implementation, neural networks have proven to be well-suited for some
important control applications.

However, several important issues are present in many real-world neural-network control
applications that have not yet been addressed effectively in the literature. Four of these
important generic issues are identified and addressed in some depth in this thesis as part of
the development of an adaptive neural-network-based control system for an experimental
free-flying space robot prototype.

The first issue concerns the importance of true system-level design of the control system.
A new hybrid strategy is developed here, in depth, for the beneficial integration of neural
networks into the total control system. The basic philosophy is to borrow heavily from
conventional control theory, and use the neural network as a key subsystem just where its
nonlinear, adaptive, and parallel processing benefits outweigh the associated costs.

A second important issue in neural network control concerns incorporating a priori
knowledge into the neural network. In many applications, it is possible to get a reasonably
accurate controller using conventional means. If this prior information is used purposefully
to provide a starting point for the optimizing capabilities of the neural network, it can
provide much faster initial learning. In a step towards addressing this issue, a new generic
“Fully-Connected Architecture” (FCA) is developed for use with backpropagation. This
FCA has functionality beyond that of a layered network, and these capabilities are shown
to be particularly beneficial for control tasks. For example, they provide the new ability to
pre-program the neural network directly with a linear approximate controller.

A third issue is that neural networks are commonly trained using a gradient-based
optimization method such as backpropagation, but many real-world systems have
discrete-valued functions (DVFs) that do not permit gradient-based optimization. One example is
the on-off thrusters that are common on spacecraft. A new technique is developed here that
now extends backpropagation learning for use with DVFs. Moreover, the modification to
backpropagation is small, requiring (1) replacement of the DVFs with continuously
differentiable approximations, and (2) injection of noise on the forward sweep. This algorithm
is applicable generically whenever a gradient-based optimization is used for systems with
discrete-valued functions. It is applied here to the control problem using on-off thrusters,
as well as for training neural networks built with hard-limiting neurons (signums instead of
sigmoids).

The fourth issue is that the speed of adaptation is often a limiting factor in the
implementation of a neural-network control system. This issue has been strongly resolved in this
research by drawing on the above new contributions: the FCA and an automatic growing
of the network combine to allow rapid adaptation in an experimental demonstration on a
2-D laboratory model of a free-flying space robot. The neural-network controller adapts
in real time to account for multiple destabilizing thruster failures. Stability is restored
within 5 seconds, and near-optimal performance is achieved within 2 minutes. This
performance is obtained despite the implementation on a serial microprocessor; implementation
on parallel-processing hardware would provide dramatically faster performance.
Chapter 1


Introduction 


This dissertation presents generic theoretical and experimental investigations into the use of
neural networks for control. As a significant “challenge problem,” a free-flying space robot
prototype equipped with on-off gas thrusters was controlled well, despite major thruster
failures, by using a new, hybrid neural-network-based reconfigurable control system. This
research was conducted at the Stanford University Aerospace Robotics Laboratory (ARL)
from 1990 to 1994.

1.1 Motivation 

Due to their capabilities for adaptation, nonlinear function approximation, and parallel
hardware implementation, neural networks have proven to be well suited for control
applications. They have been used successfully by engineers in the chemical-processing
industry, steel industry [16] [17] [47] [54], and semiconductor-processing industry [17],
as well as a number of research applications [20] [21] [38] [61] [67]. In some cases their
learning abilities and inherent nonlinear nature allow them to solve control problems and
provide performance unmatched by conventional methods. In other cases their distributed
nature and resulting computational power allow them to implement known solutions more
quickly and robustly than conventional serial processors.

Neural networks derive their advantage in solving very complex problems from the
emergent properties that come with the massive interconnection of simple processing units. With
good training techniques, the networks are capable of implementing very complex behaviors.
For example, neural networks may be used to implement arbitrary mappings of inputs to
outputs, such as from sensor signals to actuator commands in a control problem. Further,
since the mapping can be taught indirectly, neural networks are especially attractive for
poorly understood systems - they can generalize from training inputs and then respond by
interpolation in untaught situations.

Due to the distributed nature of the processing and their adaptive capability, networks
are often robust to internal component failures. Even without retraining, the distributed
processing gives the network the ability to withstand failure of several neurons without
significant impact on the functionality. In addition to this, if on-line retraining is used, the
remaining processors can adapt to account for the failure. Robustness is also contributed by
the network's ability to adapt to changes in the environment, plant, performance criteria,
etc.

These features of neural networks make them particularly attractive for control
applications. Several of these features will prove useful in the control application presented
here.

The central question is when - and how - will the incorporation of neural network
components provide a clear, cost-effective advantage in real-time control?

One central goal of this research, then, is to study the use of neural networks for
control, and to determine the characteristics of control applications that can benefit from the
application of neural networks. In certain cases, the merging of neural network technology
with control systems engineering can lead to the development of highly capable control
systems. Much neural-network theory and control theory already exists such that significant
advances in control capability could be produced simply through their astute integration.

1.2 Research Issues

Neural networks have proven themselves valuable in a number of control applications. See
for example [20] [21] [54] [64]. There are, however, four important issues, that are often most
germane in a real-world control application, that have not yet been addressed effectively in
the neural-network literature:

1. For a given control need, should a neural network be used?

• Does using a neural network provide a clear advantage over not doing so?

• If it does, then to achieve that advantage optimally, just where in the control
system should the neural network be used; and where should it not?
2. A priori knowledge is often available in the form of models of the system's key
components, and a preliminary control design (e.g. provided by “conventional” control design
techniques). Is it possible to use this a priori information to improve greatly the
performance (i.e. better initial performance, final convergence to a better solution) that
the neural network can then enable?

3. Many control applications involve the use of discrete-valued devices. For example,
thrusters often operate “on-off” rather than with analog-valued outputs. This presents
a problem for backpropagation learning, since these discrete-valued functions are not
continuously differentiable. Is it possible to modify backpropagation to accommodate
the discrete-valued functions?

4. Speed of learning is very often important in real-time control applications. It is
generally accepted that neural networks can run quickly during implementation (i.e.
once the weights have been selected) due to the availability of parallel hardware; but
the speed of learning (i.e. finding the weight values) is a separate, very critical, issue.
Can backpropagation-based learning be made fast enough to be feasible for rapid
on-line adaptation?

A “challenge problem” was formulated to focus the study of these important issues:
a reconfigurable neural-network-based adaptive control system was developed and
experimentally demonstrated on a free-flying space robot prototype. In addressing this challenge
problem, the issues were studied, neural-network developments were made, and a working
reconfigurable control system was developed [69] [70] [71] [72] [73].

The experimental apparatus is shown in Figure 2.1. Specifically, the
air-bearing-supported robot's position and attitude are controlled with eight on-off gas thrusters. The
task was this: after the random, severe mechanical failure of a number of those thrusters,
identify the new thruster-system characteristics, and reconfigure the control system to regain
stability and near-optimal performance. This challenge problem is interesting not only for
its practical applicability to space operations per se, but also - and even more pervasively -
as an application that raises and focuses on several important fundamental generic issues
in neural-network control.

The challenge problem addresses the first issue, since it is a fairly complex, yet realistic
control problem. Also, the excellent experimental performance of a pre-existing conventional
control approach is available for comparison; this is valuable for evaluating the performance
trade-offs between neural and conventional approaches. The application also helps to
motivate the second issue, a desire to make use of a priori knowledge: an approximate solution
can be calculated quickly before neural-network training begins. The desire to use this a
priori information to accelerate learning is especially present here due to the need for rapid
reconfiguration. The existence of on-off thrusters requires the development of a learning
method to deal with discrete-valued functions, highlighting the third issue. Finally, the
speed-of-learning issue is relevant, since stability must be regained quickly due to the limits
enforced by the experimental implementation (i.e. the granite table is of limited size).


1.3 Contributions 

In addressing the research issues outlined above, the research reported in this thesis makes
the following contributions to the fields of neural networks, automatic control, and robotics:

1. An adaptive neural-network-based thruster control system for a free-flying space robot
is developed. This highly nonlinear complex control problem was solved in a very
new way: by using a combination of conventional and neural network approaches,
resulting in a “hybrid” control system. The balance between neural and conventional
approaches will, in general, vary from one application to another. At issue is how to
determine the correct balance on an application-by-application basis. To address this
issue, systematic evaluation criteria have been proposed and demonstrated to aid in
the system-level design.

2. A new “Fully-Connected Architecture” is developed for neural network control. This
architecture is a generalization of the standard layered neural-network architecture.
The value of the extra connections it offers is studied. Of particular importance for
control, this new architecture allows for direct pre-programming of prior-known linear
solutions. This benefit is used in the robotic application to reduce dramatically the
time required for adaptation: a linear approximate controller is quickly calculated
and implemented before training begins. The major hurdle for successful use of this
architecture, excessive complexity, is addressed by the implementation of a systematic
complexity-control method that manages the extra connections.

There are a number of possible advantages to using prior information. Since the
network begins training with a reasonably good solution, initial performance is good;
and a better solution may result due to the better starting point for the nonlinear
optimization. It also serves as a bridge to conventional control techniques. Optimizing
the network from a starting point that is a direct emulation of a conventional controller
may facilitate valuable understanding of what the network is doing.

3. A new algorithm was devised that now permits gradient-based optimization of systems
with discrete-valued functions (DVFs). Gradient-based optimization of systems with
DVFs is difficult because the gradient of the DVF is zero everywhere, except at the
transitions, where it is undefined. The new algorithm works by forming a smooth,
continuous approximation to the DVF, and then adding noise during training. It has
been applied to a number of different applications; and each time, the value of noise
injection is clearly demonstrated. Although originally developed for application to
the on-off thruster control problem, this algorithm for gradient-based optimization
for DVFs is broadly applicable. Three applications are:

• Training a neural network control system equipped with on-off actuators. 

• Training neural networks built with hard-limiting neurons.

• Design optimization with discrete-valued design options (proposed, not yet
implemented).

4. An experimental demonstration was performed, where the neural-network-based
control system reconfigured itself rapidly in response to multiple, major, destabilizing
thruster failures. Stability is restored within 5 seconds, and near-optimal performance
is achieved within 2 minutes. This performance is obtained despite the implementation
on a serial microprocessor; implementation on parallel-processing hardware would
provide dramatically faster performance.

The experimental demonstration pulls together each of the above contributions: #1
led to the efficient system-level (hybrid) design that combines optimally the benefits
of both conventional control and neural networks; #2 resulted in rapid recovery of
stability, through the direct infusion of a linear approximate controller; #3 allowed
the use of gradient-based optimization with this control problem. The ability to use
gradient information at all dramatically improved the rate of adaptation (beyond
what non-gradient-based methods could provide). These advances, combined with an
automatic network-growing method, increase the speed of the learning process to a
point where it becomes a viable alternative for on-line adaptive control.

Each contribution is addressed individually and presented in Chapters 3 through 6 of
this thesis.
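
The pre-programming idea in the second contribution can be illustrated with a minimal sketch. It assumes a very small network in which the inputs feed the output units directly as well as through one group of tanh hidden units, with linear output units. The sizes, the gain matrix `K`, and the names `init_fca`, `matvec`, and `forward` are hypothetical and are not taken from the architecture developed in Chapter 4. If the direct input-to-output weights are set to a known linear gain and the hidden-to-output weights start at zero, the untrained network reproduces the linear controller, and training can then refine it.

```python
import math
import random

def init_fca(K, n_hidden, seed=0):
    """Build weights so that the untrained network computes u = K x.

    K is a hypothetical linear approximate controller (list of rows).
    """
    rng = random.Random(seed)
    n_out, n_in = len(K), len(K[0])
    return {
        # input -> hidden: small random values, free to be trained later
        "in_hid": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)],
        # hidden -> output: zero, so the hidden units are silent at first
        "hid_out": [[0.0] * n_hidden for _ in range(n_out)],
        # direct input -> output connections, pre-programmed with K
        "in_out": [row[:] for row in K],
    }

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def forward(w, x):
    """Forward pass: direct linear path plus the path through hidden units."""
    hidden = [math.tanh(a) for a in matvec(w["in_hid"], x)]
    direct = matvec(w["in_out"], x)
    via_hidden = matvec(w["hid_out"], hidden)
    return [d + h for d, h in zip(direct, via_hidden)]

K = [[1.0, -0.5, 0.2],
     [0.0, 0.8, -0.3]]            # made-up gains, for illustration only
weights = init_fca(K, n_hidden=4)
x = [0.3, -1.2, 0.7]
u_net = forward(weights, x)       # network output before any training
u_lin = matvec(K, x)              # output of the linear controller
```

Before training, `u_net` equals `u_lin`, so in this sketch the network starts from the conventional design instead of from random behavior.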

1.4 Background on Neural Networks

A brief background on neural networks is presented here to familiarize the reader with the
biological motivation, history, and mathematical foundation of artificial neural networks.
More complete overviews may be found in [22] [29] [67].

1.4.1 Biological Motivation 

Artificial Neural Networks are named after and motivated by the biological neural
networks that allow phenomenal computing performance in humans and other living
organisms. Despite the relatively slow computation rate of the individual human neuron, the
human brain's sound and image recognition capabilities far exceed those of current
computers. The naturally fault tolerant and adaptive nature of the parallel distributed processing
model (both biological and artificial) makes it well suited for ambiguous tasks or uncertain
environments.

The following lists highlight the different characteristics and capabilities of computers 
and the human brain. 

• Conventional Digital Computers:

- Sequential instructions

- Digital

- Addressed memory

- Speed measured in nanoseconds

- Highly accurate

- Not necessarily fault tolerant

• Human Brain:

- Massively parallel architecture

- Analog

- Associative memory

- Neuron response times on order of 1 millisecond

- Less accurate than computers

- Fault tolerant, naturally adaptive

Currently, conventional digital computers work by implementing a series of instructions,
and provide highly accurate arithmetic and logic computations in cycle times on the order
of nanoseconds.

Biological neural networks are difficult to study, and not completely understood. What
is known is that computations are carried out in parallel, with thousands to billions of
low-precision processors operating with rise times on the order of milliseconds (e.g. the
human brain has roughly 10^10 processing units (neurons) and 10^14 connections (synapses)).
The neurons communicate by sending 100 mV impulses to other neurons. Since the magnitude
of these pulses is fixed, information is encoded in the frequency of firing. By comparison, modern
microprocessors have typically 10^6 to 10^7 transistors, but only one to four computations
are executed at a time. This lack of parallelism is offset by the fast processing time on the
order of 1-20 nanoseconds (50 MHz to 1 GHz clock rate).
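
The figures quoted above can be turned into a rough back-of-the-envelope comparison. The clock periods follow directly from the stated clock rates. The brain figure rests on a crude assumption introduced only for illustration: every synapse contributes one elementary event per millisecond, which is at best an upper bound and not a claim made in the text.

```python
# Clock periods implied by the quoted clock rates.
period_slow_ns = 1e9 / 50e6     # 50 MHz -> 20 ns per cycle
period_fast_ns = 1e9 / 1e9      # 1 GHz  -> 1 ns per cycle

# Serial processor: one to four computations per cycle (from the text).
cpu_low = 1 / (period_slow_ns * 1e-9)     # computations per second, slow end
cpu_high = 4 / (period_fast_ns * 1e-9)    # computations per second, fast end

# Brain: ASSUMPTION for illustration only, one event per synapse per ~1 ms.
synapses = 1e14
response_time_s = 1e-3
brain_events = synapses / response_time_s  # synaptic events per second

# Ratio of the parallel upper bound to the fastest serial figure.
ratio = brain_events / cpu_high
```

Under these assumptions the serial processor performs roughly 5e7 to 4e9 computations per second, while the parallel upper bound is about 1e17 events per second. This is why slow individual units do not prevent high aggregate throughput.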

Despite the slow processing of each individual neuron, the massive parallelism results
in certain computing capabilities that are impossible with conventional sequential digital
processors. Some of those capabilities that are most nearly reachable with conventional
processors are: vision processing, sound processing, pattern recognition, adaptive control,
and planning. The key idea is that designing a computer with some attributes of the
biological neural network, such as parallel computation and adaptive capability, may yield
greater success in these areas than trying to push incrementally the state of the art in
conventional computing hardware and algorithms.

The potential benefits of a parallel distributed-processing approach create an incentive
to cast a problem into a form that can use the computational capabilities of this architecture.




1.4.2 History of Neural Networks 

When people began attempting sustained heavier-than-air flight, the first thought was to
build an aircraft modelled after birds. Early ornithopters attempted to reproduce the
flapping-wing motions that allow birds to fly. These designs failed. The first successful
solution, by the Wright brothers in 1903, used instead a fixed wing to produce lift, with a
wing-warping method to control the lift of each wing (similar to birds), but with an
internal-combustion-engine-powered propeller for thrust. Most aircraft today resemble birds only
slightly, in that they have a wing on each side of the fuselage, and the control system sits up
front with the vision sensors. However, the propulsion system, control system, materials,
etc. are very different. Using nature as a motivation was useful; but it has been important
to incorporate the best engineering available, and not rigidly follow the biological model.

Similarly, one of the earliest ideas for building a computer was that it should be modeled
after the human brain. Once biologists began to understand the basics about how the brain
works on a microscopic level, early neural-network researchers modelled these neurons, and
designed artificial neural networks.

However, before they understood how the brain worked, artificial computing systems
had been built in the form of mechanical adding machines. These produced precise
computations, one instruction at a time. As these mechanical linkages were replaced with electrical
circuits, vacuum tubes, transistors, and finally an integrated circuit consisting of many
transistors, the computational performance has increased dramatically, but the highly accurate
and serial attributes have persisted. This development of conventional serial processors has
continued in parallel with the development of neurally-inspired processors.

A sequence of major developments in neurally-inspired computing follows.

In 1943, McCulloch and Pitts modelled the neuron as a simple threshold device, and
analyzed the computational capabilities of networks of these functions.

In 1949, Hebb proposed a way for neurons to change the effect they had on other neurons,
forming the foundation for a model of learning.

In 1957, Kolmogorov's Theorem laid the mathematical foundation for neural networks.
This theorem proved that networks of simple neuron-like processors are able to produce
arbitrarily complex functions of their inputs [28]. This existence proof is described again in
Chapter 4.

Around 1960, Rosenblatt invented the Perceptron, a simple neuron with binary output.
An important feature of the Perceptron is the simple training rule that is guaranteed to
converge to a solution, if one exists [43]. The functionality of the Perceptron is limited, as
discussed again in detail in Chapter 5.

About the same time, Widrow and Hoff invented the LMS algorithm for training
binary-output neurons [18] [67]. This algorithm was later applied extensively to adaptive filtering
and control [68], and is the foundation of the backpropagation algorithm.

In 1969, Minsky and Papert proved the limitations of the Perceptron: 1. some
input-output mappings are impossible (e.g. XOR^1) with a single hidden layer, and 2. the number
of Perceptrons (neurons) required grows faster than exponentially with an increase in
problem complexity [32] [33].
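
Both the convergence property of the Perceptron training rule and the XOR limitation can be seen in a minimal sketch. It assumes the simplest possible case: a single threshold unit acting directly on two binary inputs, with a unit step size and a cap of 100 training passes. The function name `train_perceptron` and these settings are illustrative choices, not details taken from the cited works.

```python
def train_perceptron(samples, max_epochs=100):
    """Perceptron rule for one threshold unit with two inputs.

    On each misclassified sample the weights move by (target - output) * input.
    Returns (weights, bias, converged).
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - out
            if delta != 0:
                w[0] += delta * x1
                w[1] += delta * x2
                b += delta
                errors += 1
        if errors == 0:          # a full pass with no mistakes: solution found
            return w, b, True
    return w, b, False           # gave up: no error-free pass was reached

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, b_and, and_converged = train_perceptron(AND)
w_xor, b_xor, xor_converged = train_perceptron(XOR)
```

AND is linearly separable, so the rule reaches an error-free pass. XOR is not, so this single unit never classifies all four patterns correctly however long it is trained.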

In 1974, Werbos developed the backpropagation algorithm as part of his Ph.D. thesis
in Economics [60]. Its discovery was not widely noticed until Rumelhart's publication in
1986 [46]. The backpropagation algorithm will be described again in Chapter 5.

In 1982, Hopfield developed networks for associative memory. 

In 1984, Hinton developed the Boltzmann Machine, a type of Hopfield Network that 
uses an annealing learning process governed by Boltzmann statistics. 

In 1986, Rumelhart developed the backpropagation algorithm for training networks
with multiple hidden layers [46]. The hidden-layer neurons use continuously differentiable
sigmoid functions to permit the backpropagation of error signals used for training. This was
an important discovery, as it removed the first limitation of the Perceptron model. Although
Werbos is often credited with development of the backpropagation algorithm, Rumelhart
is credited with the development of it as a useful tool for neural-network training. The
backpropagation algorithm can be traced back further to Bryson's work in the 1960s with
multistage optimization for dynamic systems [6].

The neural network field has expanded greatly since 1986, as many researchers have
added capabilities to the backpropagation algorithm and experimented with applications.

1.4.3 Different Types of Neural Networks

Two major families of neural-network types exist: memory-based and function-based.

Function-based networks include feedforward sigmoidal networks (used in this thesis),
feedforward radial basis function networks, recurrent networks, and Adaptive Resonance
Theory (ART) networks [14]. These networks work by attempting to form a function that
“fits” the data, or training cases, they are presented. The hope is that this function forms
a generalization of the training data, and the network will perform well on new data.

^1 The EXCLUSIVE OR logic function: f(0,0) = 0, f(0,1) = 1, f(1,0) = 1, f(1,1) = 0.

Function-based networks such as backpropagation-trained feedforward sigmoidal
networks can be thought of as a means of data compression. For example, if 1000 bytes of
data are used to train a network whose weights can be described with 100 bytes, the data
has been compressed. As with all data-compression methods, this one relies on finding and
taking advantage of regularities in the data set - generalizing. If regularities do exist and are
exploited successfully, the original data set may be reproduced to a high level of accuracy.

Memory-based networks include the Cerebellar Model Articulation Controller (CMAC)
[2] [3], nearest-neighbor interpolation, probabilistic neural networks [51] [52], and Kohonen
Learning Vector Quantization [27]. Rather than learn a generalizing function of the data,
these methods store examples of the training data in memory (for example, input-output
training patterns). When presented with a new input pattern, nearby training
patterns are recalled from memory and the output is a function of these patterns (e.g. a
linear interpolation among the 5 nearest neighbors). The specifics of the processing during
learning and recall vary among the architectures listed here.
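
A memory-based recall step of this kind might be sketched as follows. Here "learning" is nothing more than storing input-output pairs, and recall combines the outputs of the 5 nearest stored patterns. The sketch substitutes an inverse-distance-weighted average for the linear interpolation mentioned above, as a simpler stand-in. The function name `recall` and the stored data are made up for illustration.

```python
import math

def recall(memory, query, k=5):
    """Return an inverse-distance-weighted average of the k nearest outputs.

    memory is a list of (input_tuple, output) pairs stored during "learning".
    """
    nearest = sorted(memory, key=lambda pair: math.dist(pair[0], query))[:k]
    weighted = []
    for inputs, output in nearest:
        d = math.dist(inputs, query)
        if d == 0.0:
            return output                 # exact match with a stored pattern
        weighted.append((1.0 / d, output))
    total = sum(wt for wt, _ in weighted)
    return sum(wt * out for wt, out in weighted) / total

# Learning is just storage: samples of y = 2x on a 1-D grid (made-up data).
memory = [((float(x),), 2.0 * x) for x in range(11)]

y_stored = recall(memory, (3.0,))   # query coincides with a stored pattern
y_new = recall(memory, (3.5,))      # query falls between stored patterns
```

Storing a pattern costs almost nothing, but every recall has to search the memory for neighbors. That search is the source of the speed tradeoff between memory-based and function-based approaches.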

Briefly, the tradeoff is that memory-based approaches learn very quickly since they
simply remember each training input, but the recall can be much slower, since the
nearest neighbors must be found and then an interpolation performed to produce an output.
Function-based approaches train more slowly, as they must compress the data into the
functional format created by the network topology, but have very fast recall. Also, the
distinction between these groups is sometimes blurred, as some systems involve a significant
amount of processing, but may be built around stored training examples.

From a controls perspective, function-based networks fit better with existing methods,
providing a generic nonlinear control element. Function-based neural-network controllers
have been used in many applications [16] [17] [38] [41] [61] [62] [67]. However, CMAC [2] [3]
is one example of a memory-based neural network that has been used extensively in control
applications [23].

Feedforward neural networks^2 built with sigmoidal activation functions (as described
above) were used exclusively in this research. Due to their general function-approximation
capabilities, it was clear that they would work well for this application. However, another
reason for their use here is that they have been used successfully for a wide variety of
applications, and do appear to hold much promise for neural-network control applications in
particular. Other neural-network architectures exist of course, with different characteristics
that may prove to offer advantages depending upon the application.

^2 Those employing no internal feedback.

Radial-Basis-Function (RBF) networks are similar in that they have a feedforward
structure, but the activation function is different. A sigmoid forms a hyperplane (i.e. a point in
1-D space, a line in 2-D space, a plane in 3-D space, a 3-dimensional hyperplane in 4-D
space, etc.) that separates the mapping space into high and low regions with a transition
region near the hyperplane. A radial basis function (typically a Gaussian function)
produces an activation near a certain point in space (i.e. a line segment in 1-D space, a circle
in 2-D space, a sphere in 3-D space, etc.). Statistical or iterative methods may be used to
choose the centers of these radial basis functions, and the weightings of these basis
functions may be calculated directly or iteratively. These can be significant advantages over
sigmoidal networks for some problems that happen to fit well with the functionality offered
by these networks - namely one- or two-dimensional mappings. However, a major problem
with RBF networks is that large numbers of hidden units are required for high-dimensional
input spaces. This can be understood by considering how the relative volume of a sphere of
influence of an RBF decreases as the dimensionality of the space increases. The problems
extending to high-dimensional input spaces provided a motivation to avoid RBFs in the study
of general neural-network control issues in this research. However, for a low-dimensional
input space (3-D for this application, >3-D for a 6-dof robot), RBFs may be viable.
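
One way to make the shrinking "sphere of influence" concrete is to compute the fraction of a hypercube occupied by the largest sphere that fits inside it. This measure is chosen here only for illustration and is not necessarily the one intended in the text. The volume of the unit n-ball is pi^(n/2) / Gamma(n/2 + 1), and the enclosing cube [-1, 1]^n has volume 2^n. The function name `inscribed_ball_fraction` is hypothetical.

```python
import math

def inscribed_ball_fraction(n):
    """Fraction of the hypercube [-1, 1]^n occupied by the unit n-ball."""
    ball_volume = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    cube_volume = 2.0 ** n
    return ball_volume / cube_volume

# Evaluate the fraction for a few input-space dimensions.
fractions = {n: inscribed_ball_fraction(n) for n in (1, 2, 3, 6, 10)}
```

The fraction is 1 in one dimension, pi/4 in two, pi/6 in three, and falls below one percent by ten dimensions. A single localized basis function therefore covers a rapidly vanishing share of the input space, and many more hidden units are needed as the dimension grows.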

These and other different neural-network architectures have many common aspects (e.g.
the issues of overfitting or system-level design), and therefore, many conclusions of the
research here will be directly applicable to these different architectures.


1.5 Reader's Guide

This chapter has served as an introduction to the research that is presented in this
dissertation. The remainder of this thesis is organized as follows:

In Chapter 2, the experimental equipment (i.e. the robot) that provides the “challenge
problem” is described in detail, and the particular thruster mapping problem addressed is
presented.
In Chapter 3, the generic issue of neural network value to specific control problems is
addressed. Criteria are presented that will aid the control systems engineer in the
system-level design of each given control system, deciding which segments, if any, it will be beneficial
to implement with neural networks.

In Chapter 4, the new concept of “Fully-Connected Architecture” (FCA) is presented.
It is used with backpropagation, and is shown to have greater functionality than a standard
layered network. Benefits of the FCA are outlined, with emphasis on its advantageous
applicability for control.

In Chapter 5, a new method is presented that allows backpropagation learning with
systems containing discrete-valued (and therefore not continuously differentiable) functions
(such as the on-off thrusters). This enabling method requires only simple modifications to
standard backpropagation, and extends to multiple layers of hard-limiting neurons or to
the FCA with no need for modification.
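
The two modifications named for this method, a smooth stand-in for the discrete-valued function and noise injected on the forward sweep, can be sketched for a single hard-limiting neuron. Every specific in the sketch is an assumption made for illustration: `tanh` as the smooth approximation, Gaussian noise added to the neuron's summed input, the step size, the noise level, and the toy data. The details of the actual method are those given in Chapter 5, not these.

```python
import math
import random

def signum(z):
    """The discrete-valued function: a hard limiter with outputs -1 or +1."""
    return 1.0 if z >= 0 else -1.0

def train(samples, epochs=200, lr=0.1, noise_std=0.3, seed=0):
    """Gradient training of one neuron using tanh plus injected noise."""
    rng = random.Random(seed)
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # Forward sweep: add noise, then use tanh in place of signum.
            z = w1 * x1 + w2 * x2 + b + rng.gauss(0.0, noise_std)
            y = math.tanh(z)
            # Backward sweep: gradient of 0.5 * (y - target)^2 through tanh.
            grad_z = (y - target) * (1.0 - y * y)
            w1 -= lr * grad_z * x1
            w2 -= lr * grad_z * x2
            b -= lr * grad_z
    return w1, w2, b

# Toy data: label is the sign of x1 + x2; points on the boundary are left out.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
samples = [((x1, x2), signum(x1 + x2))
           for x1 in grid for x2 in grid if x1 + x2 != 0]

w1, w2, b = train(samples)

# After training, the true hard limiter is put back and evaluated without noise.
accuracy = sum(signum(w1 * x1 + w2 * x2 + b) == t
               for (x1, x2), t in samples) / len(samples)
```

The smooth function is used only to obtain a usable gradient. The trained weights are then judged with the real hard limiter in place, which is the setting the controller must ultimately operate in.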

In Chapter 6, the reconfigurable neural control system for the free-flying robot is
presented. It draws upon each of the developments detailed above. Its good experimental
response to drastic destabilizing changes in the thrusters verifies rather dramatically the
viability of each of the new contributions made.

Chapter 7 concludes this dissertation with a summary of results and recommendations
for future research.
Chapter 2


Robot Control Application 


The control task addressed in this research is the control of position and attitude of a
free-flying space robot using on-off thrusters. The challenge presented here is to (abruptly)
damage mechanically a number of thrusters, and then have the control system autonomously
and rapidly reconfigure itself in real time, so as to maintain good control throughout.
Moreover, some thruster failures are strongly destabilizing, which places high demands on
the speed of recovery. The experimental system is shown in Figure 2.1, and an example
thruster failure mode is shown in Figure 2.2.

Control using on-off thrusters is a complex nonlinear problem that is important for real
spacecraft [63], and the nonlinear and adaptive capabilities of neural networks make them
attractive for this application.

The robot used here has in fact previously been successfully controlled without the use of
neural networks [56]. However, the (conventional) method relies on geometric symmetries in
the thruster layout and does not scale well to thruster controllers with higher dimensionality
in the input and output spaces. A neural-network-based approximation method does scale
well to higher-dimensional thruster controllers, and does not rely upon geometric symmetries,
so it provides a structure conducive to reconfigurable control. Additionally, the neural
approach offers computational flexibility, since the network can be designed with the
desired speed/accuracy trade-off. If implemented in parallel hardware, it can be made to be
extremely fast.

This challenge problem was chosen as an aid in highlighting and defining some of the
relevant issues in neural network control. It also serves to facilitate discussion and
explanation of the neural network control developments made in the course of this research.





Figure 2.1: Stanford Free-Flying Space Robot

This highly autonomous mobile robot operates in the horizontal plane, using an
air-cushion suspension to simulate the drag-free and zero-g characteristics of space.
It is a fully self-contained planar laboratory prototype of an autonomous free-flying
space robot complete with on-board gas, thrusters, electrical power, multi-processor
computer system, camera, wireless Ethernet data/communications link, and two
cooperating manipulators.



Nominal Configuration After Multiple Failures 



Figure 2.2: Example Failure Mode

Magnitude and direction of each of the eight thrusters is indicated by the length and
direction of the lightly shaded triangles. Thruster failures were simulated mechanically
with weaker thrusters and 90° and 45° elbows. Some of the elbows destabilize
the robot by changing the sign of the thrust in the ψ direction.

The field of neural network control is vast, so the scope of this research has been limited
to the use of feedforward neural networks¹ for a specific application. End-to-end
development of a neural-network controller for a real, complex application highlights the truly
important issues for this application, and these issues are relevant to other real-world
applications. Where possible, information will be provided to allow extension of these
developments to other applications.

Several specific attributes of the challenge problem common to other control applications
include:

1. The complete control system is complex, involving the integration of several subsystems.
Its level of complexity is similar to real-world control applications - it has
requirements for high-level human interface, trajectory planning, system identification,
and reconfiguration strategy, as well as low-level control.

2. Practical issues such as sensor integration, sample-rate selection, input/output control,
and processor selection, are very much present.

3. Much relevant control theory exists, in addition to specific control knowledge regarding
this control application.

4. Information about a mechanical plant will need to be extracted through an identification
process, so the learning task is evolving continually.

5. Rapid adaptation is required to regain stability and prevent the system from damaging
itself.

6. On-off actuators present a non-differentiable function that leads to problems with
current learning algorithms.

¹ i.e. networks with no internal feedback, such as directly from network outputs to network
inputs. The network is, however, part of a feedback control loop.

The complexity of the research task generates the requirement for a basic strategy in
addressing this control problem: The system-level issues in items 1 through 4 are handled
with a hybrid approach that involves an analysis at the system level of where the neural
network can contribute, segments the problem, and makes full use of conventional control and
system identification methods. To address issue number 5, a modified network architecture
is developed to provide fast initial learning, and to allow initial infusion of a pre-calculable
stabilizing controller. To address issue number 6, a new algorithm is developed to perform
optimization with the on-off thrusters, while still allowing the use of gradient information
to accelerate the optimization.

This chapter has three major sections:

1. The control application and experimental system (robot) hardware are described.

2. The thruster mapping problem at the center of the control application is defined.

3. A solution framework is presented, including three separate solution methods for the
thruster mapping problem.


2.1 Experimental System

The experimental system used to study issues in autonomous navigation and control of
free-flying space robots is shown in Figure 2.1. The design and construction of this robot
are discussed thoroughly in [56]. In that work, Ullman designed and built the robot, and
gave it the capability to intercept and capture a free-floating object autonomously. The
only main hardware modification required to perform the experiments described here was
the installation of accelerometers and an angular-rate sensor. These sensors are used in the
identification of the characteristics of each thruster after mechanical thruster failures occur.
These minor hardware modifications allow the robot to sense the acceleration resulting from
each of its thrusters, thus enabling the reconfigurable control system that is the focus of
this application.

Operating in a horizontal plane, the mobile robot simulates the drag-free and zero-g
characteristics of space: it exhibits nearly frictionless motion as it floats above a 2.74 x 3.66
meter (9 x 12 foot) granite surface plate on a 50 micron (0.002 inch) cushion of air. It
is a fully self-contained planar laboratory prototype of a free-flying space robot complete
with on-board gas supply, eight cold-gas thrusters for propulsion, electrical power,
multi-processor computer system, on-board camera, wireless Ethernet data/communications link,
and two cooperating manipulators [56].

The robot has a mass of 70 kg, and is controlled with eight thrusters, each nominally
producing 1 Newton of thrust. Position feedback comes from a pair of CCD cameras
mounted to the ceiling above the robot. Two cameras are required to cover the total
surface area of the granite table. The cameras detect a pattern of LEDs mounted to the
top of the robot. A custom vision processing board processes the camera output, and
produces position information at a 60 Hz update rate that is accurate to better than 1 mm.
This [x, y, ψ] vector is digitally filtered and differenced to produce a velocity vector. This
processing is performed off-board and then communicated back to the robot via a Motorola
Altair wireless Ethernet data/communications link.

The specifics of the control-system components are described in greater detail in
Chapter 6. This section will focus on the hardware central to the reconfigurable control system:
the thrusters and the accelerometers.


2.1.1 Thrusters

Central to the control system design are the actuators themselves, as shown in Figure 2.3.
Eight on-off air thrusters are used to provide redundant actuation in all three degrees of
freedom of the base. Each thruster produces about 1 N of thrust, and can operate effectively
at rates up to 60 Hz. For the purposes of this control application, they can be modelled as
pure on-off actuators, ignoring transient effects. However, the transient effects will be shown
to impact selection of the sample rate and design of the filters used for the accelerometer
signals.
Figure 2.3: Photograph of Thruster Assembly

One of the eight cold-gas thruster assemblies is shown. The brass hexagonal plug
with 6 holes is the thruster nozzle. The brass valve assembly is behind it, and
the solenoid is to the right. The entire assembly is mounted with an aluminum
bracket. Gas used is air at 690 kPa (100 psi) reservoir pressure, exiting to one
atmosphere. The converging-diverging nozzles are designed with an exit velocity of
Mach 2, resulting in one Newton of thrust per thruster [56]. The solenoid valve has
a response time of about 5 ms.


The nominal thruster nozzles are described in [36]. The six converging-diverging openings
in each nozzle were machined with a custom form tool. The expansion ratio of 1.7,
reservoir pressure of 690 kPa (100 psi), and exit pressure of 101 kPa (14.7 psi) are designed
to yield an exit velocity of Mach 2. Figure 2.3 shows an individual thruster assembly,
including a solenoid valve that controls the flow through the nozzle. The solenoid, shown to
the right of the valve, is spring loaded to stay closed, and opens fully in about 5 ms when
current is applied. The valve has a choke point of about 1.66 mm (0.065 inch) diameter.

One of the pairs of thruster assemblies that is located at each of the four corners of the
robot is shown in Figure 2.4. The nominal layout of all eight thrusters can be seen in the
left side of Figure 2.2.
Figure 2.4: Photograph of Two Thruster Assemblies


To study control reconfiguration, a number of "failed" thrusters were built to simulate
different failure modes. These failures include: zero thrust, reduced thrust, 45°
misalignment, and 90° misalignment. The hardware used to simulate physically these failures is
shown in Figures 2.5 and 2.6.

The use of a converging-diverging design resulted in a performance increase of 6.5% [56].
This may be significant for thrusters that are to be used every day, such as the nominal
thrusters on this robot. However, the "failed" thrusters with off-nominal thrust
characteristics were built with straight walls formed by drilling with standard bits ranging in diameter
from 0.25 mm (0.010 inch) to 0.69 mm (0.027 inch). The nozzles were tested on the robot,
measuring robot acceleration to determine the thruster strength.

It was not possible to build thrusters with greater thrust capability than about 1.2
Newtons by nozzle modification alone. As more air is required, the choke point in the valve
causes a greater pressure drop across the valve, and less across the nozzle. As more openings
were added, and the total nozzle area increased, thrust peaked at 1.2 Newtons with a 100%
increase in area beyond nominal, and then declined. Obtaining greater thrust would require
machining a larger valve orifice, or complete replacement of the solenoid-valve assembly.

Completely failed thrusters were simulated by nozzles with a single 0.25 mm (0.010 inch)
diameter hole rather than being plugged completely. This resulted in about 0.025 Newton of
thrust, which was 1/40th of nominal, and effectively zero. However, the presence of a small
hole means the thruster can be heard to fire, allowing an observer a better understanding
of the identification and reconfiguration process.

Figure 2.5: Thruster Failure Modes - Reduction in Thrust Level

Thruster failures are simulated by replacing the nominal thruster nozzles with
mechanically altered nozzles. The first thruster has a single 0.25 mm (0.010 inch)
diameter hole, and simulates a complete thruster failure. The second thruster has
three 0.69 mm (0.027 inch) holes, simulating a reduced-strength thruster. The third
thruster is a nominal thruster, with 6 converging-diverging holes.

The volume of the chamber between the valve and the nozzle opening has a transient
effect on thruster performance. When the valve opens, it takes a finite length of time for
the pressure to rise to the steady-state pressure (which is defined by the reservoir pressure
minus the pressure losses in plumbing and across the valve). Similarly, thrust continues
after the valve closes, while the chamber empties. This effect may be seen in Figure 2.10.
Figure 2.6: Thruster Failure Modes - Change in Thrust Direction

Thruster failures are simulated by adding elbows to change physically the direction
of thrust. The 45° and 90° elbows simulate severe (and potentially destabilizing)
thruster misalignments.

Since the 45° and 90° elbows used to simulate thruster failure increase the volume of
this chamber, this effect is increased significantly to the point that it is greater than the
sample period of 100 ms. Fortunately for the system ID process, thrusters tend to remain
in the on position for several sample periods, so the transient effects can be tolerated.

2.1.2 Accelerometers, Angular-Rate Sensor 

Accurate acceleration information is crucial to the identification process. Acceleration data
are used to identify thruster failures and build a model of the robot for reconfiguration.
Issues such as sensor noise, sensor placement, sample-rate selection, mechanical vibration,
electrical noise, and thruster transient characteristics all contribute to the difficulty in
obtaining accurate acceleration signals.
Figure 2.7: Accelerometer Photograph

Photograph of the Systron Donner 4310A Linear Servo Accelerometer. Actual size
is as shown: overall length is 76.2 mm (3.0 inches). Two accelerometers are mounted
to the robot base to measure translational accelerations.


Two Systron Donner 4310A Linear Servo Accelerometers are used. These accelerometers,
shown in Figure 2.7, have a range of ±1 g, and are accurate to better than 0.1 mm/sec².
The accuracy of acceleration measurements is limited not by the accelerometers, but by the
presence of extraneous vibrations. For example, the small cooling fan in the wireless
Ethernet receiver at the top of the robot produces a 70 Hz vibration that is clearly measurable
at accelerometer mounting positions on the robot base plate.

As with all Systron Donner accelerometers,¹ the 4310A uses a force balance. A proof
mass is suspended within the accelerometer, and moves slightly in response to acceleration,
as depicted in Figure 2.8. This displacement is measured by a position detector, and a
control circuit and torque coil are used to drive the displacement to zero. The control
current used to keep the proof mass from moving is amplified and used as the accelerometer
output signal.

¹ A full set of specifications is presented in Appendix B.



Figure 2.8: Accelerometer Circuit 

The servo-control circuit contained within the force balance accelerometer is shown. 
The control current used to keep the proof mass from moving is amplified and used
as the acceleration signal.


A Watson Industries angular-rate sensor (model # ARS-C131-1A) is used. This device,
also called a tuning-fork gyro, vibrates a tuning fork and measures the Coriolis force on each
of the beams as the fork rotates, thus producing the angular-rate signal. Accuracy is better
than 0.1 °/sec, but this needs to be differentiated to obtain angular acceleration.

The accelerometer signals and angular-rate signal pass through analog pre-filters with
two critically damped poles at 75 Hz. They are then sampled by the A/D converter at a
200 Hz sample rate (while the control loop runs at 10 Hz). The accelerometer signals are
digitally filtered with fourth-order Butterworth filters with poles at 20 Hz, and the
angular-rate signal is digitally filtered with a second-order Butterworth filter with poles at 10 Hz.
Angular acceleration is obtained by a first difference of the rate signal.
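
The digital part of the angular-rate processing chain can be illustrated with a minimal sketch. It assumes a standard bilinear-transform design for the second-order Butterworth low-pass filter (10 Hz cutoff, 200 Hz sample rate) followed by a first difference; the function names are illustrative only, the analog pre-filter is not modelled, and this is not necessarily the exact filter realization used on the robot.

```python
import math

SAMPLE_HZ = 200.0       # A/D sample rate stated in the text
RATE_CUTOFF_HZ = 10.0   # cutoff of the angular-rate filter stated in the text


def butterworth2_lowpass(cutoff_hz, sample_hz):
    """Coefficients (b, a) of a 2nd-order Butterworth low-pass (bilinear transform)."""
    k = math.tan(math.pi * cutoff_hz / sample_hz)  # pre-warped cutoff
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (1.0, 2.0 * (k * k - 1.0) * norm, (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def filter_signal(b, a, samples):
    """Run the difference equation y[n] = b.x - a1*y[n-1] - a2*y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def first_difference(samples, sample_hz):
    """Approximate the time derivative by differencing consecutive samples."""
    return [(cur - prev) * sample_hz for prev, cur in zip(samples, samples[1:])]


def angular_acceleration(rate_samples):
    """Filter the angular-rate samples, then difference them (rad/s -> rad/s^2)."""
    b, a = butterworth2_lowpass(RATE_CUTOFF_HZ, SAMPLE_HZ)
    return first_difference(filter_signal(b, a, rate_samples), SAMPLE_HZ)
```

Filtering before differencing matters here because the first difference amplifies high-frequency noise in proportion to the sample rate.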
At this point, the filtered accelerometer signals are combined with the angular-rate and
angular-acceleration signals to produce the base accelerations in [x, y, ψ]. The computations
made to combine these signals are highly dependent upon sensor placement, so the
sensors were placed to yield the highest possible accuracy, as shown in Figure 2.9.

The accelerometers measure acceleration in one direction at one location on the robot
base, so the basic task is to convert these acceleration signals into acceleration at the center
of the base. If it were practical to locate both accelerometers with their proof masses
exactly coincident with the robot mass center, one pointing straight ahead in +x, and the
other pointing in +y, no compensation would be required. This is not practical, so the
compensation requirements are:

1. Remove angular-acceleration effects (needed if the accelerometer measurement axis is
not aligned perfectly with the center of mass (c.m.), i.e. has a tangential component).

2. Remove centrifugal-acceleration effects (needed if the proof masses are not located at
the c.m. and the measurement axes have some radial component - e.g. these effects
occur even when the robot spins about its c.m. with no acceleration of the c.m.).

3. Rotate translational-acceleration vector to robot frame (needed if accelerometers are
not aligned with x and y axes).

In theory, the accelerometers could be placed anywhere on the base (as long as they are
not perfectly parallel), and centrifugal- and angular-acceleration effects could be subtracted
by calculation. However, due to the differences in accuracy for each type of sensor, choosing
the correct configuration will result in better acceleration measurements. Taking these
factors into consideration it was found that:

1. Angular-acceleration effects would be difficult to compensate due to a relatively noisy
angular-acceleration signal. For this reason, the accelerometers are aligned accurately
with the c.m. of the robot, eliminating any angular-acceleration effects.

2. The angular-rate sensor (ARS) provides a clean signal, so centrifugal-acceleration
effects can be accounted for by computation. However, the effect is proportional
to the radial distance from the proof mass to the c.m., so the accelerometers are
positioned as close to the c.m. as possible. The distance is 46.5 mm (1.83 inches).
An additional complication is the saturation of the ARS. This usually occurs only
when the robot spins out of control, before reconfiguration, but some sensing is needed
(both for centrifugal compensation and for angular-acceleration measurement). When
saturation is detected, angular rate and acceleration are obtained by digitally filtering
the vision-system position signal. The angular-rate sensor is used when possible, since
it is one derivative closer to the measurement needed, and therefore less noisy.

3. Rotational transformation is accomplished with a 2 x 2 transformation matrix.

Figure 2.9: Accelerometer Mounting Locations

The accelerometers are mounted orthogonal to each other, with their seismic centers
as close to the robot center of mass as possible, and aligned radially with the center
of mass. This facilitates the removal of extraneous acceleration signals (i.e. from
centrifugal and angular-acceleration effects) by minimizing their size and providing
good sensors for their removal. For example, the angular-rate signal is cleaner than
the angular-acceleration signal, so angular-acceleration effects are zeroed by alignment
with the center of mass, while centrifugal effects are cancelled by calculation.

The resulting accelerometer mounting locations are shown in Figure 2.9. The calculations
used to go from the sensors to the final acceleration signals are shown graphically in
Figure 6.3.

This reconstruction of the acceleration vector is carried out at a 200 Hz update rate
on-board the robot. Examples of dynamically corrected and filtered output from the
accelerometers and angular-rate sensor are shown in Figure 2.10.
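
The compensation steps for radially aligned accelerometers can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the on-board code: each accelerometer is assumed to read acceleration along its outward radial axis, the two axes are assumed exactly orthogonal, and the function and argument names are hypothetical.

```python
import math


def base_acceleration(meas_1, meas_2, omega, radius_1, radius_2, mount_angle):
    """Recover the c.m. acceleration (ax, ay) in the robot frame.

    meas_1, meas_2: readings along each accelerometer's outward radial axis (m/s^2)
    omega:          angular rate from the angular-rate sensor (rad/s)
    radius_1/2:     distance from the c.m. to each proof mass (m)
    mount_angle:    angle of accelerometer 1's axis from robot +x (rad);
                    accelerometer 2 is assumed mounted 90 degrees further round
    """
    # Radial alignment means angular acceleration contributes nothing along the
    # measurement axis, so only the centripetal term (omega^2 * r, directed
    # toward the c.m.) has to be added back.
    radial_1 = meas_1 + omega ** 2 * radius_1
    radial_2 = meas_2 + omega ** 2 * radius_2

    # 2 x 2 rotation from the accelerometer axes to the robot x-y frame.
    c, s = math.cos(mount_angle), math.sin(mount_angle)
    ax = c * radial_1 - s * radial_2
    ay = s * radial_1 + c * radial_2
    return ax, ay
```

With the 46.5 mm radius quoted above, a spin rate of 1 rad/sec gives a centripetal correction of about 0.047 m/sec², which is larger than the roughly 0.014 m/sec² produced by a single 1 N thruster acting on the 70 kg robot.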



[Plots: translational acceleration (m/sec/sec) and angular acceleration (rad/sec/sec) versus time (secs)]

Figure 2.10: Translational and Angular Acceleration Signals

Shaded areas indicate the sign and duration of a thruster pulse. 100 ms is the
minimum-length pulse used for control. Lag is due to the transient response of the
thruster and the effects of the analog and digital filtering. Acceleration persists for
longer than the thruster pulse width due to the finite chamber size between the
valve and nozzle in the thruster assembly. This data is still noisy after filtering,
but leads to accurate identification when used with the linear regression process
described in Chapter 6.
2.2 Thruster Mapping

2.2.1 Problem Definition 

The three degrees of freedom [x, y, ψ] of the base are controlled using eight thrusters
positioned around its perimeter, as shown in Figure 2.11. Each thruster produces both a torque
and net force on the robot. This coupling, combined with the on-off nature of the thrusters, 
substantially complicates the control task. 



[Diagram: desired force/torque [F_xdes, F_ydes, τ_ψdes] -> Thruster Mapper -> thruster pattern [T1, T2, ... , T8]]

Figure 2.11: Thruster Mapping Problem Definition

At every sample period, the Thruster Mapper takes a desired force vector,
[F_xdes, F_ydes, τ_ψdes], and finds the thruster settings, [T1, T2, ... , T8], to
minimize a specified cost function. The on-off thrusters and coupling between forces
and torques make this problem difficult. This mapping is calculated several times
per second, motivating the development of a nonlinear approximate solution that
can run in real time. The thruster mapper must adapt to changes in thruster
characteristics. Development of a neural network to implement this "Thruster Mapper"
is the focus of this application.


The thruster mapping task, also shown in Figure 2.11, that must be performed during
each sample period is to take an input vector of continuous-valued desired forces and torques,
[F_xdes, F_ydes, τ_ψdes], and find the output vector of discrete-valued (off, on) thruster values,
[T1, T2, ... , T8], that minimizes a specified cost function.

The robot-base-control strategy developed for this system is shown in Figure 2.12. The
complete control system is described in detail in Chapters 3 and 6. A proportional-derivative
control law produces a continuous vector of desired forces, [F_xdes, F_ydes, τ_ψdes], based on
position and velocity information from the overhead vision system. The thruster mapper takes
this force vector and outputs the pattern of thrusters to be fired on the robot.
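
A proportional-derivative law of this kind can be written in a few lines. The sketch below is only illustrative: the per-axis gain vectors `kp` and `kd` are hypothetical placeholders (no gain values are given in this chapter), and the function name is not taken from the robot software.

```python
def pd_force_command(pos_des, pos, vel_des, vel, kp, kd):
    """Return the desired [Fx, Fy, tau_psi] from position and velocity errors.

    All arguments are length-3 sequences ordered [x, y, psi]; kp and kd are
    hypothetical per-axis proportional and derivative gains.
    """
    return [
        kp_i * (pd_i - p_i) + kd_i * (vd_i - v_i)
        for kp_i, kd_i, pd_i, p_i, vd_i, v_i in zip(kp, kd, pos_des, pos, vel_des, vel)
    ]
```

The output is a continuous-valued force vector; turning it into an on-off thruster pattern is left entirely to the thruster mapper.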

Partitioning the controller into a "control module" (PD controller in this case) and a
"thruster mapper" greatly simplifies controller design since both components can be
designed independently. Smooth actuation is still possible due to the low thruster impulse,
which results from high sample rate (10-60 Hz), low thrust (force per thruster, F ≈ 1 N;
torque per thruster, τ ≈ 0.14 N-m) and high mass (mass, M ≈ 70 kg; moment of inertia,
I ≈ 3.1 kg-m²). This strategy was originally developed as part of a conventional control
system for the robot [56].
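
The size of a single thruster pulse can be checked with simple arithmetic. The short sketch below uses the nominal values quoted above and assumes one thruster fires for exactly one sample period; the constant and function names are illustrative.

```python
THRUST_N = 1.0         # nominal force per thruster
TORQUE_NM = 0.14       # nominal torque per thruster
MASS_KG = 70.0         # robot mass
INERTIA_KGM2 = 3.1     # robot moment of inertia


def velocity_step(sample_hz):
    """Velocity change (m/s, rad/s) from one thruster firing for one sample period."""
    dt = 1.0 / sample_hz
    return THRUST_N * dt / MASS_KG, TORQUE_NM * dt / INERTIA_KGM2


dv, domega = velocity_step(10.0)  # 100 ms minimum pulse
```

At the 10 Hz control rate this gives a velocity increment of roughly 1.4 mm/sec and an angular-rate increment of roughly 0.0045 rad/sec per pulse, which is why the on-off actuation can still appear smooth.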


[Block diagram: the control module produces the desired force vector; the thruster mapper converts it into the thruster pattern]

Figure 2.12: Robot-Base-Control Strategy

The control module treats the thrusters as linear actuators. The thruster mapper
must find the thruster pattern producing a force closest to that requested by the
base control module.




2.2.2 Cost Function 

Since each thruster can output only full thrust (nominally 1 Newton) or nothing, the thruster
mapper is not capable of exactly producing the requested force. The basic approach to this
problem is to define a cost function, and then to find the thruster pattern, [T1, T2, ... , T8],
that minimizes this function. The specific search or neural-network functional mapping used
to "find the thruster pattern" will be discussed in Chapter 3. In this research, a general
cost function was used that incorporates the normalized force error vector and the amount
of gas used. This function is shown in Equation 2.1.
\[
J = \left(\frac{F_{x_{err}}}{F_{x_{norm}}}\right)^{2}
  + \left(\frac{F_{y_{err}}}{F_{y_{norm}}}\right)^{2}
  + \left(\frac{\tau_{\psi_{err}}}{\tau_{\psi_{norm}}}\right)^{2}
  + \alpha_{gas} \sum_{i=1}^{8} T_{i}
\tag{2.1}
\]

where,

J            = thruster-mapping performance cost
T            = binary thruster values, [T1 T2 T3 T4 T5 T6 T7 T8]
i            = thruster number
F_xerr       = net force error in x-direction, (F_xdes - F_xact), resulting from T
F_yerr       = net force error in y-direction, (F_ydes - F_yact), resulting from T
τ_ψerr       = net torque error about ψ-axis, (τ_ψdes - τ_ψact), resulting from T
F_xnorm      = normalizing factor for F_x
F_ynorm      = normalizing factor for F_y
τ_ψnorm      = normalizing factor for τ_ψ
α_gas        = gas-weighting parameter


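In matrix form, this can be expressed as Equation 2.2,

\[
J = \mathbf{F}_{err}(\mathbf{T})^{T} \, N \, \mathbf{F}_{err}(\mathbf{T})
  + \alpha_{gas} \sum_{i=1}^{8} T_{i}
\tag{2.2}
\]

where,

\[
\mathbf{F}_{err}(\mathbf{T}) =
\begin{bmatrix} F_{x_{err}}(\mathbf{T}) & F_{y_{err}}(\mathbf{T}) & \tau_{\psi_{err}}(\mathbf{T}) \end{bmatrix}^{T}
\quad \text{force-error vector}
\]

\[
N =
\begin{bmatrix}
\dfrac{1}{F_{x_{norm}}^{2}} & 0 & 0 \\
0 & \dfrac{1}{F_{y_{norm}}^{2}} & 0 \\
0 & 0 & \dfrac{1}{\tau_{\psi_{norm}}^{2}}
\end{bmatrix}
\quad \text{normalizing matrix}
\tag{2.3}
\]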


If the robot were equipped with linear actuators (i.e. "proportional thrusters"), a vector
of continuous-valued actual forces, [F_xact, F_yact, τ_ψact], could be produced that exactly
equalled the desired force vector, [F_xdes, F_ydes, τ_ψdes], requested by the controller (i.e. J = 0).
However, perfect mapping is not generally achievable with discrete-valued thrusters, and
the weighting parameters selected in the cost function define the distribution of error (i.e.
translational force error vs. rotational force error vs. gas usage). Selection of the normalizing
factors and gas-weighting factor in Equation 2.1 define the cost function and the resulting
optimal thruster mapping.

Throughout this thesis, the normalizing factors used are the nominal force and torque
values produced by firing a single thruster. These values are indicated by F_thruster,
force-per-thruster, and τ_thruster, torque-per-thruster. With no weighting on gas usage, this results in
the minimum-length force-error vector in normalized-force space. This is a simple,
straightforward method that results in a good thruster mapper, and is used for analysis purposes
in Chapters 4 and 5. This is shown in Equation 2.4.

\[
\min_{\mathbf{T}} J =
    \left(\frac{F_{x_{err}}}{F_{thruster}}\right)^{2}
  + \left(\frac{F_{y_{err}}}{F_{thruster}}\right)^{2}
  + \left(\frac{\tau_{\psi_{err}}}{\tau_{thruster}}\right)^{2}
\tag{2.4}
\]

For the experimental implementation, discussed in Chapter 6, an additional practical
issue is present: gas usage should be reduced if it can be achieved with minimal effect on
force-mapping performance. To achieve this, an additional cost is placed on gas usage,
so that, if two candidate vectors produce similar size force errors, the more fuel-efficient
one will be chosen. A good balance between control accuracy and gas usage is found with
α_gas = 0.5. This cost function is shown in Equation 2.5.

\[
\min_{\mathbf{T}} J =
    \left(\frac{F_{x_{err}}}{F_{thruster}}\right)^{2}
  + \left(\frac{F_{y_{err}}}{F_{thruster}}\right)^{2}
  + \left(\frac{\tau_{\psi_{err}}}{\tau_{thruster}}\right)^{2}
  + \alpha_{gas} \sum_{i=1}^{8} T_{i}
\tag{2.5}
\]
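
Because there are only 2^8 = 256 on-off patterns, the minimization can be illustrated by brute force. The sketch below evaluates the normalized force-error cost with a gas penalty for every pattern and returns the cheapest one. The thruster layout in `THRUSTER_EFFECTS` is a hypothetical symmetric arrangement chosen for illustration only, not the robot's measured geometry, and the gas term is assumed to be a plain sum of the thruster values.

```python
from itertools import product

F_THRUSTER = 1.0      # nominal force per thruster (N)
TAU_THRUSTER = 0.14   # nominal torque per thruster (N-m)

# Hypothetical layout: entry i is the (Fx, Fy, tau_psi) produced when thruster i fires.
THRUSTER_EFFECTS = [
    (+1.0, 0.0, +0.14), (+1.0, 0.0, -0.14),
    (-1.0, 0.0, +0.14), (-1.0, 0.0, -0.14),
    (0.0, +1.0, +0.14), (0.0, +1.0, -0.14),
    (0.0, -1.0, +0.14), (0.0, -1.0, -0.14),
]


def cost(pattern, f_des, alpha_gas):
    """Normalized squared force error plus gas penalty for one on-off pattern."""
    f_act = [sum(t * eff[k] for t, eff in zip(pattern, THRUSTER_EFFECTS)) for k in range(3)]
    norms = (F_THRUSTER, F_THRUSTER, TAU_THRUSTER)
    err = sum(((d - a) / n) ** 2 for d, a, n in zip(f_des, f_act, norms))
    return err + alpha_gas * sum(pattern)


def best_pattern(f_des, alpha_gas=0.5):
    """Search all 256 patterns and return the one with the lowest cost."""
    return min(product((0, 1), repeat=8), key=lambda p: cost(p, f_des, alpha_gas))
```

With no gas weighting, several patterns can tie (for example, adding a pair of thrusters whose forces and torques cancel); the gas term breaks such ties in favour of the pattern that fires fewer thrusters.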


In minimizing the force error only, the thruster mapper does not consider the dynamics
of the plant. It assumes that the vector F_des output by the controller feedback law is
chosen carefully enough that it needs only concern itself with producing the closest matching
F_act. In this application, the controller component is a simple proportional-plus-derivative
controller (shown in Figure 2.12) that does not take into account the thruster limitations.
Ideally, the controller component would be aware of thruster limitations, possibly leading to
a merging of the control and mapping components. This complex nonlinear control problem
is not addressed here, but a first step is proposed in the form of a modified cost function in
Appendix A.

In summary, the cost function was chosen to be the length of the normalized force-error
vector augmented by a cost on gas usage, where the normalization factors were the
force-per-thruster, F_thruster, and torque-per-thruster, τ_thruster. For neural-network analysis only,
the cost function shown in Equation 2.4 was used. For experimental implementation, the
function shown in Equation 2.5 was used, reducing gas usage. This thruster mapper trades
off force error for a reduction in gas usage, just as an optimal controller balances error with
control effort.

Selection of the cost function defines the correct thruster pattern for any given F_des
vector. The mechanics of how this correct vector is actually found (i.e. search or
neural-network-based functional approximation) is described below.


2.3 Solution Strategy, Mapping Methods 

The reconfiguration strategy proposed in Figure 3.1 requires an "Indirect Training"
approach, where the neural network attempts to find the best mapping based on the latest
estimate of the plant model, and then adapts itself to optimize mapping performance. This
indirect training approach is shown as the top part of Figure 2.13. The word "indirect"
here refers to the lack of an optimal teacher, so the network adaptation is directed by
experimentation (in simulation) with a model of the plant. As seen in Figure 2.13, the
network's thruster pattern is passed through a model of the robot, and the resulting force
vector is compared with the desired force vector, resulting in the error signal used to train
the network (without the direction of an optimal teacher).

While "indirect learning" is the ultimate goal here, two other methods, "direct learning"
and "exhaustive search," are developed as steps towards this goal. All three methods are
summarized in this section.

In the development of an indirect training procedure, several issues must be addressed,
including neural-network architecture and optimization (also referred to as training,
learning, or adaptation). To "separate variables," and permit the study of these generic issues
separately, an intermediate step, "Direct Training," is introduced. This step, shown in the
middle part of Figure 2.13, permits the development of neural-network architecture
selection and optimization procedures which can then be carried over directly to the indirect
training problem.

In direct training, the network is taught simply to copy an "optimal teacher," in this
case the optimal thruster mapping. To obtain this optimal mapping, a search must be
performed over all possible thruster combinations. Fortunately, when all thrusters are
working correctly (before the reconfiguration due to thruster failures), symmetries exist
that can simplify the search process. This non-neural-network approach is shown in the
bottom part of Figure 2.13.
[Diagram, three panels:
INDIRECT TRAINING: Fdes -> neural network -> T -> robot model -> Fact
DIRECT TRAINING
EXHAUSTIVE SEARCH (optimal solution): symmetry pre-process -> search for T -> symmetry post-process -> T (= Topt), the optimal mapping]

Fdes: desired force, [Fxdes, Fydes, τψdes]
Fact: actual force, [Fxact, Fyact, τψact]
T: thruster values, [T1, T2, ... , T8]
Topt: T that minimizes the cost function
error: signal minimized to train the network

Figure 2.13: Thruster-Mapping Methods

Indirect training (i.e. with no optimal teacher, adaptation is based upon performance
with the robot model) is the ultimate goal, but direct training is used to
study architecture and optimization issues, and an exhaustive search (symmetry-aided)
is used to generate the optimal mapping required by direct training.


These three different techniques have also been used to make possible evaluation of 
performance and comparisons. Due to the discrete nature of the thrusters, even the optimal 
thruster mapper results in significant errors. This optimal performance level is used to 
evaluate the performance of the neural-network control system. Also, use of the direct 
training performance as a benchmark for evaluation of the indirect training performance 
allows study of the issues involved in indirect training. 

Although the final goal is indirect training, the methods need to be developed in reverse 
order, i.e. (1) optimal search, then (2) direct training, then (3) indirect training. Each 
successive method builds upon knowledge gained in the previous step, as they work towards 
the final goal of indirect learning. The first step contributes the robot-base-control strategy, 
and an optimal solution to be used as a benchmark. The second step contributes 
understanding of architecture and optimization issues. The final step contributes a new learning 
algorithm to accommodate the on-off thrusters. The result is a learning system that can 
be used in the reconfigurable control application. Segmenting the problem in this manner 
resulted in a “separation of variables,” and allowed for concentration on one issue at a time. 

These methods are presented in the order they were developed, since each one builds 
upon the previous step, but the final method, indirect training, is the one used in the 
reconfigurable control system required when thruster failures occur. Presenting the search 
method first also serves as a motivation for the neural-network approach, as the limited 
extensibility of this method is highlighted. 

2.3.1 Thruster Mapping by Exhaustive Search 

The first implementation, SEARCH, used an exhaustive search at each sample period to 
find the thruster pattern that minimizes the force-error vector [56]. Symmetries are used 
to reduce greatly the search space, enabling it to run in real time at a 60 Hz sample rate. 
This solution method does not scale well for a three-dimensional robot, or when thruster 
failures are allowed, disrupting the symmetries. This provides the motivation for using a 
neural network: the neural network is used to learn and implement an approximation to the 
optimal solution - one that can be computed in real time. 

The idea behind the exhaustive search is that there are a finite number of possible 
thruster combinations (in this case, with eight bi-level thrusters, there are 2^8 = 256 
combinations), so the thruster mapper can evaluate each possible combination, and choose the 
one that minimizes the specified cost function. This process must be executed at every 
sample period, so to speed up the process it is very helpful if the symmetries in the system 
can be exploited. 
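The brute-force idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thruster layout in `THRUSTER_FORCES`, the unit strengths and moment arms, the squared-error cost, and the `gas_weight` value are all hypothetical and are not taken from the thesis.

```python
import itertools
import numpy as np

# Hypothetical planar layout (an assumption, not the thesis geometry):
# each row is the [Fx, Fy, torque] produced by one unit-strength thruster.
# Thrusters come in directly opposing pairs (0,1), (2,3), (4,5), (6,7).
THRUSTER_FORCES = np.array([
    [+1, 0, -1], [-1, 0, +1],   # pair acting along x, mounted at y = +1
    [+1, 0, +1], [-1, 0, -1],   # pair acting along x, mounted at y = -1
    [0, +1, +1], [0, -1, -1],   # pair acting along y, mounted at x = +1
    [0, +1, -1], [0, -1, +1],   # pair acting along y, mounted at x = -1
], dtype=float)

# All 2^8 = 256 on-off patterns, one per row.
ALL_PATTERNS = np.array(list(itertools.product([0, 1], repeat=8)), dtype=float)


def exhaustive_search(f_des, gas_weight=0.05):
    """Return the on-off pattern minimizing force error plus a gas penalty."""
    f_act = ALL_PATTERNS @ THRUSTER_FORCES               # (256, 3) achievable forces
    force_error = np.sum((f_act - f_des) ** 2, axis=1)   # squared force error
    gas_used = ALL_PATTERNS.sum(axis=1)                  # number of thrusters firing
    cost = force_error + gas_weight * gas_used           # assumed weighted cost
    best = int(np.argmin(cost))
    return ALL_PATTERNS[best].astype(int), f_act[best]


pattern, f_act = exhaustive_search(np.array([1.7, -0.2, 0.1]))
print(pattern, f_act)
```

Every call evaluates all 256 candidates, which is the per-sample cost that the symmetry arguments in the following subsection aim to reduce.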

Search Simplification Using Geometric Symmetries 

If the thrusters are all the same strength (the nominal configuration assumed in this 
example), firing two directly opposing thrusters will produce no net thrust. To 
eliminate these useless combinations, the eight on-off thrusters, [T1, T2, ..., T8], may be 
considered as four backwards-off-forwards thrusters [R1, R2, R3, R4], where each Ri 
represents the reaction force resulting from one pair of opposing thrusters. This reaction 
force representation can be used here to reduce the possibilities to 3^4 = 81. Now the robot 
is considered to have 4 tri-level thrusters instead of 8 bi-level thrusters. This 
simplification is valid whenever two thrusters of equal magnitude are directly opposing. 

Figure 2.14: Possible Force Vectors with Eight Symmetrical Thrusters 

Units are in normalized thrust units. Each of the 65 circles represents a force 
vector that is achievable with the nominal configuration of eight on-off thrusters. A 
simplified version of the thruster mapping problem is to find which of these circles 
is closest to the desired force vector. The problem is complicated by the additional 
desires to save gas and to accommodate failed thrusters. 


The next level of simplification comes about due to the redundancy in actuation 
capability. Elimination of redundant combinations (e.g. firing T1 and T2 produces the exact 
same net force vector as firing a different pair of thrusters) reduces this number to 65. 
Since redundant combinations occur due to many thrusters having common strengths and 
regular positions, this simplification fails when these conditions are not met. These 65 
remaining available thrust vectors are plotted in Figure 2.14. 

Symmetries about the x - y, x - ψ, and y - ψ planes allow us to consider candidates 
in the first octant only, reducing the search space to 16. The final symmetry is about the 
x = y plane. This further reduces the number of candidate vectors to 11, resulting in the 
11 locations shown in Figure 2.15. 
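The sequence of reductions can be checked by direct enumeration. The sketch below assumes a hypothetical layout (four tri-level reaction forces with unit strength and unit moment arms); the actual thruster geometry is not given in this section, but under this assumption the enumeration reproduces the counts quoted above.

```python
import itertools


def net_force(r):
    """Net (Fx, Fy, torque) for tri-level reaction forces r = (R1, R2, R3, R4).

    Assumed geometry: R1, R2 act along x at y = +1, -1, and
    R3, R4 act along y at x = +1, -1, all with unit strength.
    """
    r1, r2, r3, r4 = r
    return (r1 + r2, r3 + r4, -r1 + r2 + r3 - r4)


bi_level = list(itertools.product([0, 1], repeat=8))        # eight on-off thrusters
tri_level = list(itertools.product([-1, 0, 1], repeat=4))   # four tri-level thrusters
unique = {net_force(r) for r in tri_level}                  # drop redundant combinations
first_octant = {f for f in unique if min(f) >= 0}           # all components non-negative
half_octant = {f for f in first_octant if f[0] >= f[1]}     # fold across the x = y plane

print(len(bi_level), len(tri_level), len(unique), len(first_octant), len(half_octant))
# prints: 256 81 65 16 11
```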

[Figure 2.15: plot of the possible force vector locations (normalized thrust units).] 

Figure 2.15: Possible Force Vectors after Symmetric Transformation 

With all thrusters equal strength, and a geometrically symmetric layout, the 65 
candidate thrust vectors can be reduced to 11 through symmetric transformation. 
This simplifies the search, allowing it to run in real time (this simplification is not 
possible when thruster failures occur). 

The procedure to implement this symmetry-aided search is to take the desired force 
vector and use the symmetries mentioned above to transform it into the first half of the 
first octant in force space. This is done by taking the absolute values 
of the vector components, and swapping the x and y components if necessary. Then this 
vector is compared to each of the 11 prototypes, resulting in 11 costs (perhaps a weighted 
cost function involving gas usage and force error), one for each of the 11 candidates. The 
candidate corresponding to the minimum cost is selected as the optimum. The thruster 
pattern associated with this candidate is then transformed to undo the symmetric 
transformations, bringing the force vector to the correct location in the full three-component force 
space. The resulting thruster pattern is implemented on the robot. 
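A minimal sketch of this fold, search, and unfold procedure follows. It assumes a hypothetical geometry (matrix `B`, unit strengths and moment arms) and a pure squared-force-error cost with no gas term; the names `symmetric_search`, `full_search`, and `PATTERN_FOR` are illustrative only. For simplicity the sketch undoes the transformation on the force vector and then looks up a matching tri-level pattern, rather than transforming the thruster pattern itself.

```python
import itertools
import numpy as np

# Hypothetical geometry (an assumption): rows map R1..R4 to [Fx, Fy, torque].
B = np.array([[1, 0, -1], [1, 0, 1], [0, 1, 1], [0, 1, -1]], dtype=float)
COMBOS = np.array(list(itertools.product([-1, 0, 1], repeat=4)), dtype=float)
FORCES = COMBOS @ B

# Offline tables: unique achievable vectors, and the half-octant prototypes.
ACHIEVABLE = np.unique(FORCES, axis=0)
PROTOTYPES = ACHIEVABLE[(ACHIEVABLE.min(axis=1) >= 0)
                        & (ACHIEVABLE[:, 0] >= ACHIEVABLE[:, 1])]

# For each achievable force, keep the pattern that fires the fewest thrusters.
PATTERN_FOR = {}
for combo, force in zip(COMBOS, FORCES):
    key = tuple(force)
    if key not in PATTERN_FOR or np.abs(combo).sum() < np.abs(PATTERN_FOR[key]).sum():
        PATTERN_FOR[key] = combo


def symmetric_search(f_des):
    """Nearest achievable force vector, found by searching the prototypes only."""
    f_des = np.asarray(f_des, dtype=float)
    signs = np.where(f_des < 0, -1.0, 1.0)    # remember the sign flips
    f = np.abs(f_des)                          # fold into the first octant
    swap = f[0] < f[1]                         # fold across the x = y plane
    if swap:
        f[[0, 1]] = f[[1, 0]]
    best = PROTOTYPES[np.argmin(np.sum((PROTOTYPES - f) ** 2, axis=1))].copy()
    if swap:                                   # undo the transformations
        best[[0, 1]] = best[[1, 0]]
    return best * signs


def full_search(f_des):
    """Reference: search every achievable force vector."""
    return ACHIEVABLE[np.argmin(np.sum((ACHIEVABLE - f_des) ** 2, axis=1))]


f_best = symmetric_search([-1.3, 1.8, 0.4])
print(f_best, PATTERN_FOR[tuple(f_best)])
```

Because the sign flips and the x-y swap preserve distances and map the achievable set onto itself in this assumed layout, the prototype search attains the same force error as the full search.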

Reduction of the search space from 256 candidates in the general case to 11 in the fully 
symmetric case is critical to allowing the thruster mapper to run in real time. The amount 
of computation required to transform the Fdes vector into this half-octant, search over 11 
vectors, and then transform the minimum-cost vector back to the one corresponding to the 
full 3-space input, then produce the T vector, is significantly less than if the symmetries 
were ignored and the search included 256 patterns. 

Difficulties in Extending This Method 

For a free-flying robot operating in three dimensions, the number of possible thruster 
combinations increases greatly (for example, with 24 on-off thrusters, there are 2^24 = 16,777,216 
combinations). This is partially offset, due to the number of symmetries also increasing 
(for a fully symmetric 3-D robot with 24 thrusters, there are 469 combinations that would 
need to be searched after complete symmetric reduction - a significant reduction, but still 
computationally demanding; see footnote 3). 

Unfortunately, geometric symmetries may not exist, due to other spacecraft design 
constraints, or due to unanticipated thruster failures. In this case, the full number of thruster 
combinations would need to be searched to obtain the optimal solution. This situation 
motivates the use of a neural network for the thruster-mapping component: it is used to 
implement a nonlinear approximation to the optimal solution that can be computed in real 
time. 

An alternative to developing a neural network to produce a function that approximates 
the result of the optimal search, is to use a sub-optimal search that can run in the time 
constraints imposed by the application. A simple example would be to limit the possible 
combinations to two thrusters firing at a time. In this case, only 24·23/2 (2 thrusters) + 24 
(1 thruster) + 1 (no thrusters) = 301 combinations would need to be searched at each sample 
period. While this may make the problem tractable, mapping performance will be reduced 
drastically. Other sub-optimal search schemes may be developed that are more efficient than 
this simple example. One possible scheme is presented by Sperduti and Stork in “A Rapid 
Graph-based Method for Arbitrary Transformation Invariant Pattern Classification” [53]. 
This method was developed for an Optical Character Recognition application, highlighting 
the fact that this control application is similar to a pattern classification problem. 
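The two counts quoted for the 24-thruster case follow from elementary combinatorics, as this short check shows (the variable names are illustrative):

```python
from math import comb

n_thrusters = 24

# Full search: every on-off pattern of 24 thrusters.
full_search = 2 ** n_thrusters

# Restricted search: two thrusters, one thruster, or no thrusters firing.
at_most_two = comb(n_thrusters, 2) + comb(n_thrusters, 1) + comb(n_thrusters, 0)

print(full_search, at_most_two)  # prints: 16777216 301
```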

2.3.2 Direct Training of a Neural-Network Thruster Mapper 

The search method described above defines the optimal solution to the thruster mapping 
problem. The next two methods are neural network approximations to this optimal solution. 
Since they are approximations, they will be sub-optimal, but can be designed to run in real 
time. 

3 An algorithm to automate derivation of the symmetric transformation functions has been developed by 
Kurt Zimmerman and Brian Kemper at the Stanford Aerospace Robotics Laboratory. 

In the second method, DIRECT TRAINING, a neural network is trained to emulate 
the optimal mapping produced by the exhaustive search [71]. The network is repeatedly 
shown several desired force vectors along with the optimal thruster pattern chosen by the 
search algorithm. The weights in the network are adapted using backpropagation to make 
the network outputs match those produced by the search algorithm (the optimal solution). 
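As a rough illustration of direct training, the sketch below trains a small layered network by backpropagation to imitate an exhaustive-search teacher. Everything specific here is an assumption made for the example: the geometry `B`, the tri-level output coding, the 16-unit hidden layer, the learning rate, and the training-set size. It does not use the Fully-Connected Architecture developed in the thesis.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical geometry (assumed): rows map R1..R4 to [Fx, Fy, torque].
B = np.array([[1, 0, -1], [1, 0, 1], [0, 1, 1], [0, 1, -1]], dtype=float)
COMBOS = np.array(list(itertools.product([-1, 0, 1], repeat=4)), dtype=float)


def teacher(f_des, gas_weight=0.05):
    """Optimal teacher: exhaustive search over the 81 tri-level patterns."""
    cost = (np.sum((COMBOS @ B - f_des) ** 2, axis=1)
            + gas_weight * np.abs(COMBOS).sum(axis=1))
    return COMBOS[np.argmin(cost)]


# Training set: desired force vectors paired with the teacher's patterns.
X = rng.uniform(-2.0, 2.0, size=(500, 3))
Y = np.array([teacher(x) for x in X])

# One hidden layer of tanh units; tanh outputs lie in (-1, 1).
W1 = rng.normal(0, 0.5, (3, 16))
b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 4))
b2 = np.zeros(4)


def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, np.tanh(h @ W2 + b2)


def mse():
    return float(np.mean((forward(X)[1] - Y) ** 2))


initial_loss = mse()
lr = 0.05
for _ in range(2000):
    h, out = forward(X)
    # Gradient of half the squared error, averaged over samples,
    # backpropagated through the output and hidden tanh layers.
    d_out = (out - Y) * (1 - out ** 2) / len(X)
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)
final_loss = mse()
print(initial_loss, final_loss)
```

The network never sees the thruster model here; it only copies the teacher, which is what distinguishes direct from indirect training.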

This DIRECT TRAINING approach is useful primarily in that it allows the study of 
network architecture and topology issues before tackling the additional problems that come 
with indirect learning. Hence it serves as a stepping stone to the goal of indirect learning. 

The approach also has potential advantages beyond that of an intermediate step. In 
particular, using a neural network as a function emulator may increase computational speed 
and system robustness very significantly due to the distributed, parallel nature of the com- 
putation. 

The investigation of the network topology issues associated with this DIRECT TRAINING 
approach led to the Fully Connected Architecture, presented in Section 3. The FCA 
can also be used with the indirect training method described below. 


2.3.3 Indirect Training of a Neural-Network Thruster Mapper 

Once the topology issues have been investigated during the direct training exercise, the 
network architecture can be chosen. The topology of the network (i.e. the number of 
neurons, and their interconnections) defines the functional complexity capacity of the network, 
whether it is trained directly or indirectly. With the architecture already selected to provide 
the required mapping accuracy, the next step is to focus on the training methods. 

In the third method, INDIRECT TRAINING, a neural network is trained to find the 
optimal solution when presented with a model of the plant, but no optimal teacher. This 
required backpropagation of error through the discrete-valued thrusters, which in turn 
motivated development of the noise injection method to be presented in Chapter 5. This 
structure, shown in the top part of Figure 2.13, reveals that the thruster mapper is forming 
an inverse of the thruster model. Using a neural network to learn a plant inverse, and using 
this inverse in the forward control loop, is a common approach for neural-network control. 
As will be discussed later, the presence of non-differentiable hard limiters complicates the 
development of this inverse. 

With this form of training made possible, the neural network control system is able 
to reconfigure itself quickly in response to even drastic changes in thruster characteristics. 
There is no longer a need to develop the search algorithm as an optimal teacher. 

When evaluating mapping performance, the search method represents a lower bound on 
mapping error, since it defines the optimal solution. Direct-training performance will be used 
as a benchmark for comparison with indirect training, since it represents the lower bound 
defined by the finite mapping complexity available with the chosen network topology. 

2.4 Summary 

The control application chosen to study neural-network control is reconfigurable thruster 
control of a free-flying space robot prototype, a capability compelled by major failures 
in the robot’s thrusters. This chapter has described the experimental equipment used, 
the thruster mapping problem that is at the center of this control application, and the 
approach taken towards solution of the thruster mapping problem (that includes the use 
of three separate solution methods in building towards the final implementation). The 
remainder of this thesis develops a complete solution to this control problem, and presents 
advances in neural-network theory made to address this specific problem and the rather 
broad generic range of important real-world control problems that it represents. 


Chapter 3 


Control System Overview 


This chapter presents an overview of the reconfigurable control system developed for the 
application described in Chapter 2. This is a complex control system, involving the integration 
of several components. As mentioned in Chapter 1, often the most important, and 
sometimes the most difficult, aspects of a neural-network control application are the decisions 
about how to structure the control system and which components are to be 
neural-network-based. 

Specifically, the first issue is to determine whether the application is one where neural 
networks can contribute efficiently better (and cheaper) control than is achievable without 
them. If they can, the second issue is to determine the optimal system architecture, that is 
to determine in just which segment(s) of the control system they should be used in order 
to do just that at minimal cost. This is the essence of astute hybrid control, a central 
contribution of this research. 

In addition to presenting the system-level control system design, the reasons for 
choosing this structure are given. While this particular structure does not represent a general 
architecture for developing neural-network control systems, the new methodology that led 
to this structure is general, and can be applied to the development of a wide variety of 
neural-network control systems and neural network applications in general. 

While this chapter discusses the overall control system and design considerations, 
Chapters 4 and 5 provide in-depth discussion of the specific neural-network issues encountered, 
and Chapter 6 provides a more detailed discussion of each of the control-system components. 

3.1 Control System Structure 

Figure 3.1 shows the overall system block diagram. The additions made here beyond the 
control system presented in Figure 2.12 include a user interface and an adaptive capability. 
These segments will be discussed in detail in Chapter 6. This chapter focusses on the 
system-level design considerations. 

The objective is to control the position and attitude of the robot base, while subject 
to multiple, large, possibly-destabilizing changes in thruster characteristics. The plant is 
linear and well-modelled, except for the actuators, which are on-off thrusters that could 
have altered characteristics. An accurate vision system provides high-bandwidth position 
feedback, which is then digitally filtered and differentiated to provide velocity. On-board 
accelerometers and an angular-rate sensor are used to provide base-acceleration 
measurements. 



Figure 3.1: Reconfigurable Control System - Block Diagram 

This control system is based upon a conventional indirect adaptive controller, such 
as a self-tuning regulator. Examples of the continuous-valued Fdes vector and 
the corresponding discrete-valued T vector are shown. The ID block represents 
a recursive-least-squares identification of thruster strength and direction. This 
continually-updated model is passed to the neural network training block, shown 
in detail in Figure 5.6. The continually-updated neural thruster mapper is copied 
periodically into the active control loop. 

3.1.1 Control System Design Considerations 

Some control-system design considerations for this application include: 

1. The robot is to be controlled by a human user at a high level, so path 
planning/trajectory generation is required. 

2. The robot must reject disturbances and, at a low level, be robust to actuator and 
plant-model inaccuracies; so a robust feedback system is required. 

3. Gas usage should be minimized where possible. 

4. High-performance control is desired. The requirements for a free-flying space robot 
are different from those for a simple satellite control system. A robot is expected 
to carry out multiple-degree-of-freedom trajectory tracking with high control 
bandwidth. Satellites tend to spend their time regulating attitude to a fixed direction, 
or slowly slewing to a new direction. Satellite thruster-control systems are therefore 
usually designed for regulation performance and stability provability, at the expense 
of trajectory-following performance. For example, a satellite control system may look 
for the largest desired torque (roll, pitch, or yaw), and enforce a one-dimensional 
bang-bang control law in that degree of freedom only [63]. 

5. A non-adaptive conventional control system already exists. 

Temporal issues that influence the control design include: 

1. Control bandwidth is below 1 Hz. 

2. Acceptable robot-base control performance can be obtained with a 10 Hz 
thruster-update rate. 

3. Accelerometer bandwidth extends from 0 Hz to greater than 500 Hz. 

4. Extraneous vibrations exist from 30 Hz and up. 

5. Thruster transient effects are on the order of 30 Hz and up. 

6. During reconfiguration in response to thruster failures, stabilization is required within 
15 seconds due to limited table area. 

Items 2-5 lead to the selection of a thruster-control update rate of 10 Hz, but with a 
sensor sample rate of 200 Hz. Analog prefiltering and digital filtering are performed on 
this over-sampled data to produce clean acceleration signals. The time limit imposed by 
item 6 provides sufficient time to begin, but not necessarily to finish, building a model of 
the system. This leads to a design that has the adaptation running concurrently with the 
identification - there is not enough time to wait for the identification to converge. 

3.1.2 Indirect Adaptive Control System 

These system characteristics happen to fit well with a standard control structure known as 
“indirect adaptive control.” This refers to the use of sensor information to build a model 
of the system, and then to redesign a controller based upon the updated plant model. The 
“indirect” here refers to the intermediate step of building a model of the system. This is 
the structure shown in Figure 3.1. 

The user issues desired- position commands to the robot via a graphical user interface. 
The current and desired position are used by a trajectory generator to calculate the path 
for the robot to follow, resulting in a trajectory vector, Xdes, consisting of positions and 
velocities in the three degrees of freedom at each sample time. This desired state vector 
is input to a PD controller, along with the actual state vector, which is provided by the 
overhead vision system. The Proportional-Derivative controller can be used due to the 
simplicity of the plant (this is basically a 1/s^2 plant, so no integral control is needed [8]), 
and the availability of a high-fidelity velocity signal. The PD controller output, Fdes, is sent 
to the Thruster Mapper, resulting in the thruster pattern T. This T is then implemented 
on the robot. 
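The role of the PD controller on a 1/s^2 plant can be illustrated with a single-axis simulation. This is a sketch under stated assumptions: the gains, mass, and time step are hypothetical, the commanded force is applied as a continuous value (no thruster mapper or on-off quantization), and no disturbances are modelled.

```python
def simulate_pd(x_des, kp=4.0, kd=4.0, mass=1.0, dt=0.01, steps=1000):
    """Simulate PD position control of a 1/s^2 (double-integrator) plant."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        f_des = kp * (x_des - x) + kd * (0.0 - v)   # PD law, desired velocity is zero
        a = f_des / mass                             # force applied directly to the mass
        v += a * dt                                  # semi-implicit Euler integration
        x += v * dt
    return x, v


x_final, v_final = simulate_pd(x_des=1.0)
print(x_final, v_final)
```

With these assumed gains the closed loop is critically damped, and the position settles at the commanded value without an integral term, because the plant itself already contains two integrators.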

The low-level portion of the control system, consisting of the trajectory generator, PD 
controller, thruster mapper, and position sensor, is always running, and does not have 
adaptive capability. The adaptive system is highlighted in Figure 3.1, and consists of three 
components: sensors, an identification process, and a controller redesign process. The 
accelerometers and angular-rate sensor produce a base acceleration-measurement vector. 
These signals, along with the thruster firing signals, are used by the identification process 
to update a model of the robot's thruster characteristics. This model is periodically sent 
to a control redesign process that generates an updated thruster mapper based upon the 
updated robot model. This updated thruster mapper is periodically copied to the thruster 
mapper running in the control loop, as indicated by the double arrow. The control, 
identification, and controller redesign loops are all running concurrently. Due to the possibility 
of a destabilizing failure, there is not enough time to wait to generate a new updated plant 
model before redesigning the controller. 

So far, this structure makes no mention of neural networks. The factors involved in the 
decision of where to use neural networks are outlined below. In this application, a recursive 
least squares linear regression ID component was used, since identification of the thruster 
characteristics is a linear process. The algorithm used to obtain acceleration measurements 
was nonlinear, but could be derived analytically, so no neural network was used there either. 
A neural network was used for the thruster-mapping component since it is an inscrutable 
nonlinear function that requires adaptation. The control redesign process is therefore a 
backpropagation-based neural-network training algorithm. 
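The recursive-least-squares identification mentioned above can be sketched as follows. The regression structure is an assumption made for illustration: the measured base acceleration is modelled as a linear combination of the on-off firing pattern, with one unknown row of `theta` per thruster. The "true" thruster values, the noise level, and the forgetting factor `lam` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" thruster model: row i is the base acceleration
# [ax, ay, alpha] produced by thruster i (values are illustrative only).
theta_true = rng.normal(0.0, 1.0, size=(8, 3))
theta_true[3] = 0.0                       # e.g. a failed thruster produces nothing

theta = np.zeros((8, 3))                  # current estimate of the thruster model
P = 1e3 * np.eye(8)                       # large initial covariance: little prior knowledge
lam = 0.99                                # forgetting factor (assumed value)

for _ in range(500):
    t = rng.integers(0, 2, size=8).astype(float)         # on-off firing pattern
    a_meas = t @ theta_true + rng.normal(0.0, 0.01, 3)   # noisy acceleration measurement
    k = P @ t / (lam + t @ P @ t)                        # RLS gain vector
    theta += np.outer(k, a_meas - t @ theta)             # correct by the prediction error
    P = (P - np.outer(k, t @ P)) / lam                   # covariance update

max_error = float(np.max(np.abs(theta - theta_true)))
print(max_error)
```

Because each update uses only the current sample, the estimate is available at every step, which suits a design in which adaptation runs concurrently with identification.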

The neural network is used precisely at the location where it is beneficial: the thruster 
mapper. If the robot were to remain perfectly symmetric, with no degradation, and it was 
restricted to in-the-plane motions with 8 thrusters, the symmetry-assisted search would 
work well enough, and no neural network would be required at all. In this application, the 
benefits of the neural network approach are required only if the symmetries are lost and 
adaptation is required. 

The selection of this system architecture, and the following development of a 
neural-network-based reconfigurable control system, present one specific example of a successful 
application of neural networks for control. However, the decisions of how to structure 
the control system, and where and how to use the neural network, are more general: The 
lessons learned during the construction of this system may in fact be applied to any 
candidate neural-network control application. For example, although this application used an 
indirect adaptive control structure, the methodology that follows is not restricted to this 
architecture. 

3.2 Cost/Benefit Analysis 

To determine where neural networks contribute effectively, the control systems engineer 
must consider the strengths of neural networks (nonlinear, adaptive, generic, unstructured, 
parallelizable) as well as the costs associated with those benefits (difficult to understand 
workings or prove stability, design is iterative, computationally complex). The cost/benefit 
balance must be evaluated on an application by application basis. First at the system level, 
the system requirements, and considerations of degree-of-nonlinearity, adaptation 
requirements, and computational complexity, etc., lead to a candidate system architecture. Then 
at the component level, this cost/benefit analysis is repeated, leading to the decision of 
what sort of subsystem will be used in each segment of the control system. 

Before evaluating the applicability of neural networks for a control (or other) application, 
it is useful to examine, in more detail, the specific costs and benefits of neural networks, 
since these are what will be weighed in the design decision. 

3.2.1 Benefits of Neural Networks 

• Nonlinear - Since neural networks tend to be designed with an iterative gradient 
search, they can handle nonlinear internal and external (e.g. system to be controlled) 
components just as easily as linear ones. 

• General - The most common neural-network architecture, the multi-layer perceptron 
(feedforward network with sigmoidal activation functions) has been proven to be 
capable of representing any MIMO function to an arbitrary degree of accuracy. This 
was presented by Hornik et al. in “Multilayer Feedforward Networks are Universal 
Approximators” [10]. This generality is important when neural networks are 
developed in software, but also for hardware implementation, where the ability to build 
multi-purpose ICs is valuable. 

• Unstructured - Unlike a linear mapping or Fourier transform, there is no pre-specified 
structure to the computation a neural network can perform. The structure is 
developed during training as the network parameters are set, defining the strength (or 
existence) of connections between neurons. 

• Parallelizable - Neural networks are designed to be implemented in parallel 
hardware. In most applications, they are developed in software, and implemented on 
serial-computing hardware, since that presents a more convenient development 
environment, and most of the effort is spent during the design and development phase. 
Hardware implementation then has the potential for vast improvements in processing 
throughput. An additional benefit of parallel hardware implementation is that the 
network is robust to partial processor failure. For example, in a space application, 
if cosmic rays were to destroy a few of the neurons, it is unlikely that the output 
would be significantly affected, since the output is determined by contributions from 
thousands or millions of neurons. Additionally, the remaining neurons would be able 
to adapt to compensate for the damage. 

3.2.2 Costs Associated With the Use of Neural Networks 

• Black Box - The functionality of a neural network is defined by the connection 
strengths, i.e. a large number of parameters. This, coupled with the fact that they are 
nonlinear, means that it is difficult to understand what they do. It may be possible to 
verify the network’s performance for a sufficiently large range of conditions, leading 
one to trust that the network will work well, but it is not easy to understand why the 
network does what it does (contrary to a simple linear controller, where it is often 
possible to study the gains or poles and zeros to form an understanding of the function 
of the controller, and perhaps why the automatic design process chose that function). 

• Stability Proofs - Due to a neural network’s nonlinearity and complicated structure, 
it is virtually impossible to develop rigorous stability proofs for it. This is a big 
concern for control systems that put high demands on stability, such as aircraft and 
spacecraft. One way to address this problem is to have a high-performance neural 
network control system with a backup low-performance linear controller that has been 
proven stable. If instability were ever detected, control authority would be switched 
to the low-performance system. 

• Iterative - Function-based neural networks, such as those described in this thesis, 
are not calculated in one step but are developed through an iterative process known 
as training or learning. This takes time, and since it is a nonlinear optimization, 
convergence to a global minimum is not guaranteed. Fortunately, the local minimum is 
rarely significantly worse than the global optimum. The FCA, presented in Chapter 4, 
addresses both of these problems: by pre-programming in a linear solution, the initial 
training performance is as good as the best linear solution; also, starting the network 
close to a reasonably good solution makes it less likely that the optimization will 
terminate in an undesirable local minimum. 

• Computational Complexity - The neural network may have excess neurons or 
connections, thereby offering more functional complexity than is needed. This results 
in slower execution, and creates a susceptibility to overfitting (poor generalization). 
Fortunately, many network pruning methods are available that eliminate the excess 
complexity, but this remains a complicating issue. 

The specific costs and benefits vary between different types of networks. For example, 
memory-based networks do not have the iterative cost mentioned above, as they just store 
all sensor information and recall the relevant information when needed. Also, some neural 
networks may be better than others for a specific problem - for example, MLPs vs. RBFs, 
as discussed in Chapter 1. 

3.3 Criteria For Valuable Application of Neural Networks 

Study of these costs and benefits, the focussed (experimental) experience with the robot 
application, and examination of other successful neural-network applications has led to the 
following summary. It is a concise list of criteria for an application where use of neural 
networks will be advantageous. The application should be: 

• Nonlinear - The powerful nonlinear capability of neural networks comes at the 
significant cost of computational complexity, slow convergence speed, and lack of provability. 
If no advantage will be obtained from this capability, it should be avoided. 

• Inscrutable - The fact that neural networks provide a general nonlinear 
function-approximation capability makes them particularly valuable for problems where the 
nonlinearity is inscrutable. If the exact form of nonlinearity is known (e.g., sin, cos, 
quadratic functions, etc.) it should be used explicitly; however, this may not be 
practical if the speed requirement calls for parallel hardware. For example, if 10,000 
sin(x^2 + y^2) operations are needed at a 1 MHz update rate, parallel hardware is 
required, and it may not be feasible to custom design an Application-Specific Integrated 
Circuit (ASIC) for this application, where it may be feasible to train a neural network 
chip to emulate this function. 

• (possibly) Requiring Adaptation - Since neural networks are generally trained 
iteratively based upon some form of error feedback, they are already set up for adaptation 
to changes in the plant or environment. Therefore adaptive capability can be added 
with minimal effort, enhancing their applicability in adaptive control situations. 

or 

• Requiring parallel hardware (processing speed) - The availability of parallel neural 
network hardware may make a neural-network approximation to even a known 
nonlinear function (for which parallel hardware does not exist and is not efficiently 
implemented using a microprocessor or programmable logic device) highly advantageous. 

An understanding of the alternative methods (statistics, linear adaptive control, etc.)
is useful for determining whether the benefits a neural network can offer outweigh the
costs for each application. It is common to see examples in the literature of neural network
control systems used where a linear adaptive controller would have been easier to implement,
and would have worked better. It is also common to see flawed justifications for neural
control like “this is a difficult control problem that has not been solved using
conventional methods, so we propose to use a neural network, (simply) because neural
networks can do things conventional methods cannot.”

Once it has been determined that the application can benefit from the use of neural
networks, these same principles should be used to determine which segments of the overall
control system are advantageously implemented with neural networks and which are not.
(This is the essence of the optimal hybrid system concept.)

In applying these principles to the robot control application, the conclusion is that
a neural network will be beneficial. As mentioned in the previous section, the task is
to develop an approximation to the optimal thruster mapping, which can be calculated
optimally, but is too complicated to run in real time. This mapping is indeed both highly
nonlinear and inscrutable, and does require adaptation in response to changes in the thruster
characteristics. Limitations of the neural network approach for speed of reconfiguration,
and training with the on-off thrusters, will be addressed with extensions to neural network
theory in these areas.
Fully-Connected Architecture


A number of issues are present in the thruster mapping control application discussed in
Chapter 2 that are common to many neural network control problems.

• Prior information about the system exists, and it should be possible to exploit this
information when generating the neural network.

• Initial learning speed is important if the neural network will be trained on-line.

• The neural network topology (the number and connectivity of neurons) required to
achieve an accurate mapping without over-fitting is unknown beforehand.

• Some of the control outputs (thruster values) influence one another (e.g., directly
opposing thrusters should never fire together).

The most relevant of these features for the robot control application are the first and
second ones. Reconfiguring in response to a destabilizing thruster failure places a high
premium on speed of adaptation. The architecture presented here allows immediate
implementation of a linear solution that is calculated using conventional methods. This
provides a low-performance, but immediately-viable controller to use as a starting point in
the optimization.

In this chapter, a general neural-network architecture that addresses these issues is
suggested. This “Fully-Connected Architecture” is for feedforward neural networks that can
be trained using backpropagation [46] [60], and refers to the structure shown in Figure 4.1.
It was first presented by Werbos [61], and initially developed in a control context by Wilson
and Rock [71].
The extra connectivity of this architecture, which is unavailable in a layered network,
allows seamless integration of linear a priori solutions, communication among input and
output neurons, and greater overall functionality than a layered network. The increase
in parameters can exacerbate over-fitting problems, and a systematic complexity-control
method is successfully demonstrated that lessens this problem.



[Figure 4.1: network diagram, inputs on the left and outputs on the right.]

Figure 4.1: Extra Connections Available with FCA

This general feedforward architecture subsumes more common single or double-
hidden-layer architectures. Here, the FCA is shown to have all the connections
of a single-hidden-layer network, and some extras as well. The network's neurons
are considered to be ordered, beginning with the first input, ending with the last
output, and having hidden units in between, perhaps interspersed among input or
output units. Note that there is no longer a concept of layers. Backpropagation
restricts information flow to one direction only; so to get maximum interconnections,
each neuron takes inputs from all lower-numbered neurons and sends outputs to all
higher-numbered neurons.
4.1 Background


In the literature, the term “fully-connected feedforward neural network” usually refers to
a layered network, with an input layer, one or more hidden layers, and an output layer.
“Feedforward” indicates that signals flow from the input layer, through hidden layers, and
to the output layer in one direction only, which is required by the backpropagation
algorithm. “Fully-connected” indicates that every input is connected to every neuron in the
first hidden layer, and so on between successive layers. While this layered architecture may
be particularly well suited for many applications and certain hardware implementations, a
more general structure may be able to take advantage of the full capabilities offered by the
backpropagation algorithm [46].

In this work, the term “fully-connected” will refer to the structure shown at the bottom
of Figure 4.1. Instead of layers, a fully-connected network can be considered to have
neurons that are ordered, beginning with the first input, ending with the last output, and
having hidden units in between, perhaps interspersed among input or output units [61].
Backpropagation restricts information flow to one direction only; so, again, to get maximum
interconnections, each neuron takes inputs from all lower-numbered neurons and sends
outputs to all higher-numbered neurons. For example, the last output neuron takes inputs
from all the hidden neurons, just as in a layered architecture; however, it now also takes
inputs from each of the input neurons and previous output neurons.
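
As a minimal sketch of the ordering rule described above (not the implementation used in
this work), the forward pass below visits neurons in index order and feeds each one from
all lower-numbered neurons. The function name `fca_forward`, the tanh sigmoid, the
inputs-then-hidden-then-outputs ordering, the omission of bias weights, and the clamping of
input neurons (so input crosstalk is ignored) are all assumptions made for illustration.

```python
import math


def fca_forward(weights, inputs, n_hidden, n_outputs):
    """Forward pass through an ordered, fully-connected feedforward network.

    weights[i][j] is the weight from neuron i to neuron j; only entries with
    i < j are read, so information flows in one direction only.
    Assumed neuron order: inputs, then hidden units, then outputs.
    """
    n_in = len(inputs)
    n_total = n_in + n_hidden + n_outputs
    act = list(inputs) + [0.0] * (n_total - n_in)
    for j in range(n_in, n_total):
        # Each neuron sums the activations of ALL lower-numbered neurons.
        net = sum(weights[i][j] * act[i] for i in range(j))
        is_hidden = j < n_in + n_hidden
        # Sigmoid (here tanh, an assumption) on hidden units only; outputs are linear.
        act[j] = math.tanh(net) if is_hidden else net
    return act[n_in + n_hidden:]


# Tiny illustrative network: 2 inputs, 1 hidden neuron, 2 outputs.
W = [[0.0] * 5 for _ in range(5)]
W[0][3] = 2.0    # feedthrough: input 0 -> output 0
W[1][4] = 3.0    # feedthrough: input 1 -> output 1
W[3][4] = -1.0   # output crosstalk: output 0 inhibits output 1
outputs = fca_forward(W, [1.0, 0.5], n_hidden=1, n_outputs=2)
```

With the hidden weights left at zero, the second output is driven both by its own
feedthrough weight and by the first output, a path that a layered network does not offer.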

The main benefit is not that it maximizes the connections-to-neurons ratio, but instead
that, when combined with a systematic weight-pruning procedure, it allows a more flexible
use of layering. There has been a recent trend in using not one but two hidden layers: the
FCA is a generalization of that trend.

In the application addressed in this work, the extra connections are found to be useful
when coupled with a procedure to control over-fitting. In particular, the 3 x 4 matrix in the
upper right corner of the weight matrix shown in Figure 4.2 provides direct linear
information flow from input to output (sigmoids are used only for the outputs of hidden
neurons), and the 3 x 3 upper-triangular matrix in the lower right corner provides
communication between outputs. While these functions could be provided with processing
components in series or parallel with the network, the fully-connected architecture provides
a seamless integration of these capabilities.
4.2 Comparison with a Layered Network

Figure 4.2 highlights the benefits of the extra connections that are unused in a
single-layered network.



[Figure 4.2: weight matrix for a network with 3 inputs, 5 hidden neurons, and 4 outputs.
W(i,j) is the weight connecting neuron i to neuron j. The legend distinguishes entries with
no connection (W(i,j) = 0), connections present in a 3-5-4 layered network, and additional
connections available with the FCA. Labeled regions: (1) feedthrough weights - direct,
linear connection from input to output; (2) flexibility - subsumes one, two, ... hidden
layered topologies; (3) output crosstalk - communication among outputs; (4) input
crosstalk - communication among inputs.]

Figure 4.2: Weight-Matrix Representation to Highlight Benefits of FCA


• Feedthrough Weights: this segment, shown in region 1 in Figure 4.2, is a matrix that
implements a direct, linear connection from inputs to outputs (sigmoids are used only
on hidden units). This provides fast initial learning, and allows direct pre-programming
of a linear solution calculated by some other method. This is particularly important
for control applications, where there is a large body of linear control knowledge that
can be drawn upon to provide a good starting point. The FCA provides for seamless
integration of linear and nonlinear components.

• Flexibility: since the FCA subsumes any number of hidden layers, when combined
with a systematic weight-pruning procedure, the network topology (defined by the
remaining connections) is set in a systematic manner based on gradient descent. The
weights shown in region 2 of Figure 4.2 represent the flexibility of the FCA, in that
the connections may be configured to provide one- and two-hidden-layer topologies (in
general, any feedforward network topology).

• Crosstalk among inputs and outputs: these connections, shown in regions 3 and 4 of
Figure 4.2, may be valuable, i.e. one output may excite or inhibit another output, a
feature unavailable with layered networks.

The “disadvantages” (i.e. issues which must be addressed) of the FCA include:

• Increased complexity: number of weights increases quadratically with the number of
hidden units, versus linearly for a layered architecture. The extra weights increase
susceptibility to over-fitting.

• Slower hardware implementation: updating must be one neuron at a time, versus one 
layer at a time for layered networks. 

This general architecture makes full use of the backpropagation algorithm, while still
allowing the use of modifications, such as the use of FIR connections in place of weights [57]
or backpropagation through time [38]. Figure 4.1 shows the extra connections that are
unused in a single-layered network. The question is whether the benefits of the enhanced
functionality outweigh the increased computational load and susceptibility to over-fitting.
This must be decided for each application. A more detailed description of each of these
features of the FCA follows.

4.2.1 Feedthrough Weights 

For the robot control application, the most important aspect of these connections is that
they provide a means for directly pre-programming the network with a pre-calculated linear
solution. This results in fast reaction to a destabilizing thruster failure. Initializing the
network to a good linear solution may result in a better final solution, as described below.
Another benefit is that the feedthrough weights make it easy for the network to implement
a linear solution, so the FCA will work well when the actual solution has a strong linear
component (a common situation) superposed with a nonlinear correction.

Motivation: Infusion of Prior Knowledge

Much is already known about how to find linear approximate solutions to many problems,
both in control and elsewhere. Often, the standard solution is a linear one, and there are
many highly advanced, very powerful, linear design tools available. However, for many
real-world problems, there are significant nonlinearities, and often the fallback procedure
is to use a linear controller designed for a linearized plant. Nonlinear design methods
exist, but certainly not at the level of linear ones. One of the purported benefits of
neural networks is that they address this problem with their adaptive, nonlinear
approach [36].

Although a network often can be trained to solve a problem starting with no prior
information, taking advantage of the (often abundant) linear theory can improve the learning
rate and provide a better solution if properly presented to the network.

Beginning the network at a reasonably good starting point can lead to a better final
solution if it prevents the network from getting stuck in an unfavorable local minimum.
This can also be useful as a learning guide. When Nguyen and Widrow trained the original
truck backer-upper [38], the initial learning runs were made with the truck pointed at,
and a few steps away from, the loading dock. After mastering this easy task, the initial
conditions were made progressively more difficult, leading the control system through a
gradual learning process. Backpropagation-through-time training for unstable systems like
the truck can benefit greatly from some outside direction of the learning process. The
teaching process used by Nguyen and Widrow is one form of such direction, and linear
initialization of an FCA network is another.

In general, it is possible to use existing linear control theory to form a linear solution
to a problem (possibly a linearized version of a nonlinear problem). In many cases, this
solution will in fact be a reasonable solution to the full nonlinear problem. The
feedthrough portion of the weight matrix offers a direct vehicle to import and implement
this linear solution as part of the neural network. Similar alternative techniques for
building in knowledge include first training the (layered) network to emulate the linear
solution, then adapting from there, or running the linear solution in parallel with the
network. One benefit of the FCA approach is the seamlessness of the network-linear solution
integration - it immediately becomes part of the network. Adaptation of this portion of the
network can be turned off, use the same algorithm as the rest of the network, or use an
adaptation algorithm based on linear theory.










Approximate Linear Solution: Thruster Mapping Example

A simple example of this situation exists here: the exact solution to the thruster mapping
problem is highly nonlinear and complex, but there is a linear approximate solution that may
be easily calculated. The feedthrough weights of the fully-connected network architecture
simplify infusion of this a priori knowledge.
The a priori linear solution used here was found by assuming that the thrusters are
capable of continuous-valued thrust output (a linearized version of this problem). The
solution is simply a 4 x 3 pseudo-inverse of the 3 x 4 matrix that maps reaction forces, R
(the set of four [-1,0,1] thrusters), to base forces, F. Recognizing that the direct
feedthrough segment of the fully-connected network provides exactly this computation
(output = weight-matrix x input), it is possible to incorporate this a priori knowledge by
putting the pseudo-inverse linear solution directly into that sub-matrix, as an initial
condition for the weight matrix. This linear mapper is then rounded off to the actual thrust
positions possible at run time (-1,0,1).
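
The pseudo-inverse initialization can be sketched as follows. This is an illustrative
example only: the 3 x 4 matrix `B` is a hypothetical thruster geometry (the actual
thruster parameters are not given in this section), and the helper names `pseudo_inverse`
and `thruster_command` are invented for the sketch. It assumes the force matrix has full
row rank, so the right pseudo-inverse B^T (B B^T)^-1 applies.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def invert(m):
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * u for v, u in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def pseudo_inverse(b):
    """Right pseudo-inverse B^T (B B^T)^-1; valid when B has full row rank."""
    bt = transpose(b)
    return matmul(bt, invert(matmul(b, bt)))


# Hypothetical geometry (illustrative numbers only): columns are four
# bidirectional thrusters, rows are the base forces [Fx, Fy, torque].
B = [[1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.5, -0.5, 0.5, -0.5]]

# 4 x 3 block that would be loaded into the feedthrough sub-matrix.
W_feedthrough = pseudo_inverse(B)


def thruster_command(force):
    """Linear map, then rounding to the achievable thrust levels -1, 0, 1."""
    cont = [sum(w * f for w, f in zip(row, force)) for row in W_feedthrough]
    return [max(-1, min(1, round(c))) for c in cont]


command = thruster_command([2.0, 0.0, 0.0])
```

For this made-up geometry, a pure +x force request of 2 units fires the two x-axis
thrusters and leaves the other two off.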

The problem is more complex if thruster failures are allowed, and the one-sidedness of the
thrusters is considered. For example, the linear approximate solution may request negative
thrust from a thruster, which is not physically possible (certainly in space, and practically
elsewhere). The approach taken here is to find when negative thrusts are requested and
attempt to reassign these thrusts to positively-valued thrusters. This is done exactly when
two opposing thrusters exist, but is inexact when an opposing thruster does not exist for each
thruster. Since this provides only the starting point for adaptation, it is not critical that
the linear approximate solution is optimal. A solution that considers one-sided
continuously-valued thrusters is presented in [25]. This was developed for the Gravity
Probe B satellite, which is unique in having proportional thrusters, rather than on-off
thrusters.¹

Approximate Linear Solution: General Case 

Alternatively, if a linear solution is expected to work well, but cannot be found through
analysis, the network can find one adaptively. This involves zeroing all weights except the
feedthrough ones, and using the standard backpropagation algorithm. At this point, with
a linear problem, convergence will be very fast, as the cost function is parabolic (for
direct, supervisory training). This increase in initial learning rate can be valuable for
certain real-time applications, both on start-up, and after a significant change in the
system, where it is critical to find a stable solution very rapidly. Once the system is
stabilized (if this is possible with a linear controller), the rest of the network can be
freed up to deal with the nonlinearities.

It is not necessary to zero the rest of the weights when training the linear portion
- the linear weights initially learn at a much greater rate than the others when all are
subjected to the same training algorithm. This effect is explained below, and can be seen
in Figure 4.3, which shows connection activation levels at various stages during training.

If implemented on a serial processor, and speed is an issue, it may be useful to skip these
extra computations during the initial learning phase, since they do not contribute much to
the network performance.

¹ The satellite carries liquid helium that boils off and must be expelled anyway.
Figure 4.3: Network Connection Activity During Training

Mesh plots show the magnitude of network connections (weights). A weight matrix
format is used, as in Figure 4.2. Fully-connected networks are in the top row, layered
networks in the bottom row. First plot is after 25 epochs. Second plot, top, is after
training with the feedthrough connections frozen to the linear solution. Second plot,
bottom, is after training the layered network. Third plots, top and bottom, are the
final solutions (local minima) after all weights were allowed to adapt.




A weight must contribute significantly to the output before the resulting error signal will
cause it to change significantly. If all weights are started small, the feedthrough weights
learn fastest, since the input and output information provides an immediate error-gradient
signal. Once these signals build up, the crosstalk weights receive strong learning signals
and begin to adapt. Starting all weights with an initial condition of zero will allow the
feedthrough and crosstalk weights to adapt, but all other weights remain at zero throughout
the learning because there is no error signal due to these weights to stimulate learning.
It is common in network training to initialize these weights to some random values. Choosing
the initial condition for the “random” weights is a problem in itself. The method presented
by Nguyen and Widrow [37] has proven to be valuable in this application.
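
The zero-initialization argument above can be made explicit with the standard
backpropagation recursions. The notation below is an assumption introduced for this
sketch: a_j is the activation of neuron j (with a_i = x_i for input neurons), d_o is the
desired value of output o, f is the hidden-unit sigmoid, and E is a sum-of-squares cost.
The derivation also assumes an odd sigmoid such as tanh, so that f(0) = 0; with a logistic
sigmoid, f(0) = 1/2 and the hidden-to-output weights would receive a nonzero gradient.

```latex
\begin{align*}
\mathrm{net}_j &= \sum_{i<j} W(i,j)\,a_i, \qquad
a_j = \begin{cases}
        f(\mathrm{net}_j) & \text{hidden neuron } j \\
        \mathrm{net}_j    & \text{output neuron } j
      \end{cases} \\[4pt]
E &= \tfrac{1}{2} \sum_{o} (a_o - d_o)^2, \qquad
\frac{\partial E}{\partial W(i,j)} = \delta_j\, a_i, \qquad
\delta_j \equiv \frac{\partial E}{\partial\, \mathrm{net}_j} \\[4pt]
\delta_o &= (a_o - d_o) + \sum_{k>o} W(o,k)\,\delta_k, \qquad
\delta_h = f'(\mathrm{net}_h) \sum_{k>h} W(h,k)\,\delta_k \\[8pt]
\text{With all } W = 0:\quad
  & a_h = f(0) = 0, \quad \delta_h = 0, \quad a_o = 0, \quad \delta_o = -d_o \\
\text{feedthrough:}\quad
  & \frac{\partial E}{\partial W(i,o)} = -d_o\, x_i \neq 0 \ \text{in general} \\
\text{into / out of hidden:}\quad
  & \frac{\partial E}{\partial W(i,h)} = \delta_h\, a_i = 0, \qquad
    \frac{\partial E}{\partial W(h,k)} = \delta_k\, a_h = 0 \\
\text{output crosstalk:}\quad
  & \frac{\partial E}{\partial W(o,o')} = \delta_{o'}\, a_o = 0
    \ \text{at first, nonzero once } a_o \neq 0
\end{align*}
```

Under these assumptions the hidden-unit gradients stay at zero for every later step as
well, because a_h and delta_h remain zero as long as all weights into and out of the hidden
units are zero, while the crosstalk gradients become nonzero as soon as the feedthrough
weights have moved the outputs away from zero.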

Once the feedthrough weights have been found, either analytically or adaptively, they
can be frozen or allowed to adapt, depending on the problem. In the case of an analytical
solution, an adaptive algorithm distinct from backpropagation may be appropriate.


4.2.2 Value of Cross-Talk Connections


In addition to the value of linear feedthrough connections, the upper triangular matrices
contribute by providing the capability for crosstalk among outputs and among inputs. These
weights allow one output to excite or inhibit a higher-numbered output. As a clear example
for the thruster mapping problem, if the network were to have an output (0,1) for each of
the eight thrusters, and during training, a penalty was put on gas use, the network could
use this segment to allow the firing of one thruster to prohibit the firing of the opposite
thruster (which would provide zero net thrust and waste fuel). This is so clear that in this
case, it could perhaps most easily be implemented by manually programming these weights,
although the network would eventually learn this as well. The example illustrates the value
of crosstalk between input and output neurons that is unavailable in a layered network.
Another example would be the capability to select between redundant output patterns: if
[1 1 0 0] and [0 0 1 1] both produce the same net force, they may both be equally likely
to activate when that force is requested. This could result in either [1 1 1 1] or [0 0 0 0].
The crosstalk would allow the network to use the first output to send it to either of the
acceptable solutions, and avoid the ambiguity.

Crosstalk between all outputs would be nice, but backpropagation limits us to
uni-directional information flow. This may make it important to select carefully the
ordering of inputs and outputs. If more complicated, nonlinear crosstalk is desired, extra
neurons may be placed between individual output or input neurons.
4.2.3 Hidden-Neuron Interconnections


FCA Generalizes the Concept of Hidden Layers 


The FCA is a generalization of the feedforward layered network. It therefore subsumes
layered networks with any number of hidden layers, i.e. it has all the functionality of a
two or three-layered network. This can be seen in Figure 4.4, which shows how two and
three-layered networks can be represented by the FCA.



[Figure 4.4: weight-matrix diagrams. The legend distinguishes entries with no connection
(W(i,j) = 0), connections in the layered network, and additional connections in the FCA
network.]

Figure 4.4: FCA Subsumes Any Feedforward Layered Network

The FCA is shown to include (as subsets) all the connections available in two or
three-layered networks. In general, it subsumes any feedforward network topology.
The matrix representation here is similar to that in Figure 4.2.


Since the FCA subsumes any number of hidden layers, when combined with a systematic
weight-pruning procedure, the network topology (defined by the remaining connections) is
set in a systematic manner based on gradient descent. The weights shown in region 2
of Figure 4.2 represent the flexibility of the FCA in that the connections may be
configured to provide one- and two-hidden-layer topologies (in general, any feedforward
network topology).

This flexibility is valuable, since often it is not known a priori which network topology
is best-suited for the application. Coupled with a systematic network pruning method
(presented below), the FCA allows for the network topology to be automatically chosen.

One Hidden Layer or Two?

The topology of a network can have a significant impact on the functional capabilities of
the neural network. It is generally accepted that at least one hidden layer is necessary to
perform mappings that are not linearly separable. However, the decision to use one hidden
layer, two, or more, is an active area of research [1] [5] [9] [13] [19] [21] [26] [30] [35]
[41] [49] [50] [55] [58]. This section presents some background in this area. There is no
consensus among the researchers - the number of hidden layers needed appears to vary from
one application to another. Fortunately, since the FCA subsumes all layered networks, this
issue is not so critical if the FCA is used with a systematic network pruning algorithm.

In “On the Representation of Continuous Functions of Many Variables by Superposition
of Continuous Functions of One Variable and Addition,” A.N. Kolmogorov presents
a mathematical proof regarding the functional complexity of neural networks. He shows
that a one-hidden-layer network with 2n + 1 hidden neurons (where n is the number of
inputs) can implement any continuous mapping from n inputs to m outputs [28]. This is
important, since it provides a mathematical foundation for the functional capabilities of
neural networks, but there are two difficulties: (1) the nonlinear activation functions of
each of the hidden neurons are not specified; (2) he does not show how to find the weights
or nonlinear functions.

In “Multilayer feedforward networks are universal approximators,” Kurt Hornik, Maxwell
Stinchcombe and Halbert White show that any function can be universally approximated
to arbitrary accuracy using a neural network with only one hidden layer [19]. This
requires that the network has “sufficient” hidden units, but no method for determining the
number of hidden units is given. Additionally, there may be cases where a network with
more than one hidden layer can implement the mapping more efficiently (using fewer
neurons and connections, although more layers). This is more applicable than Kolmogorov's
work, since the authors worked with standard sigmoidal nonlinear activation functions.

In “Feedback stabilization using two-hidden-layer nets” [50], E.D. Sontag shows that
while single-hidden-layer networks may be sufficient to implement direct input-output
mappings, double-hidden-layer networks are required (to guarantee that it will work in the
general case) to implement one-sided inverses of continuous mappings. This is especially
important in control problems, where it is common to invert a plant model. This is the case
in the thruster mapping, where the thruster mapper is an inverse of the thruster-to-force
mapping defined by the thruster parameters.

In “Why two hidden layers are better than one” [9], D.L. Chester presents an example
where a simple two-hidden-layer network is sufficient but an infinite number of hidden
neurons would be required if a single hidden layer were used.

In “Threshold circuits of bounded depth” [15], A. Hajnal presents problems requiring an
exponential number of nodes in a single-hidden-layer network, but a polynomial number of
nodes in a double-hidden-layer network.

As far as using one or two hidden layers for specific applications, different researchers
have found success with both architectures. In [21] and [30], networks with one hidden
layer are found to perform better than those with two hidden layers. In [26], [41], and [55],
networks with two or more hidden layers are found to perform better.

Since the decision to use one or two hidden layers is simply not an issue with the FCA,
the lack of consensus on this issue is not a major concern.

4.2.4 Learning Performance: FCA vs. Layered 

Figure 4.5 compares learning histories (thruster mapping error on the training set) for the
thruster mapping problem (with direct training) outlined in Chapter 2. Three networks are
compared, each with l\ hidden neurons. Each was trained to emulate the optimal mapping
(minimizing force error). Training a neural network is an iterative nonlinear optimization,
and will usually produce a different result each time it is run, provided with a different
initial condition. For this reason, results are presented as the average of several runs,
each from a different initial condition of the weights. In this plot, each curve in the
figure represents the average performance for ten different sets of initial weights.

This is the direct training problem mentioned in Chapter 2. Even though indirect
training is the ultimate objective, in order to demonstrate the performance of the FCA,
the direct-training problem is studied here first. Direct training is much simpler, while
still containing all of the architecture issues to be found in the indirect training problem.

Looking at the initial learning performance, the FCA network performs better than
the layered network, due to the weight gradient being instantly available via the direct
connection of inputs to outputs. As expected, the FCA network with the a priori linear
solution built in provides the best early performance. Although the randomly-initialized
networks catch up fairly quickly here, this initial head-start can be critical for a control
application because it can mean the difference between stability and instability. This will
be demonstrated later, in Chapter 0.

In the middle region, between 100 and 1000 epochs, the layered network performance
surpasses that of the FCA, due to the reduced number of parameters and simplified search
space. However, after 1000 epochs, the greater functionality of the FCA network comes into
play, and performance surpasses that of the layered network. This is of course expected,
since the FCA network has all of the connections of the layered network in addition to the
extra ones described earlier. The FCA network with the a priori solution frozen in has
slightly worse final performance, since the feedthrough weights are not adapted in this case.

Figure 4.5: Training Performance Comparison

The fully-connected network (FCA) learns much faster at first, due to the linear
connections. After the initial surge, the layered network passes it due to the
reduced number of parameters and resulting faster learning. Towards the end, the
fully-connected network's performance is significantly better - highlighting its extra
capabilities. This is not surprising, since the FCA network subsumes the functionality
of the layered network. The network initialized with the linear solution begins
with significantly better performance.
4.3 Architecture Selection to Avoid Overfitting

4.3.1 Overfitting 

The above has shown the potential value of the extra connections associated with a
fully-connected neural network, both in faster initial learning and in better final
performance. However, the high number of parameters, while increasing functionality, makes
the network susceptible to over-fitting. A layered network with i inputs, h hidden neurons,
and o outputs has (i + o)h weights, while a fully-connected network has
(i + h + o)(i + h + o - 1)/2 weights, not counting bias weights. More parameters to adapt
means the network will be slower to train, and possibly susceptible to overfitting. This is
an important concern with the FCA, and must be addressed.
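
As a quick check of these two counts, the short sketch below evaluates them for the
3-5-4 network of Figure 4.2. The function names are illustrative, and bias weights are
excluded, as in the text.

```python
def layered_weights(i, h, o):
    """Weights in a single-hidden-layer network: input-hidden plus hidden-output."""
    return (i + o) * h


def fca_weights(i, h, o):
    """Weights in the FCA: one per ordered pair of distinct neurons (lower -> higher)."""
    n = i + h + o
    return n * (n - 1) // 2


layered = layered_weights(3, 5, 4)   # 3 inputs, 5 hidden, 4 outputs
fully_connected = fca_weights(3, 5, 4)
print(layered, fully_connected)      # prints: 35 66
```

For this small network the FCA carries nearly twice as many adjustable weights, and the
gap widens as hidden units are added, since the FCA count grows quadratically.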

A common method for evaluating the level of overfitting is to use a method known as
“cross-validation.” In this method, a set of input-output data (known as the “test set”) is
kept separate from the set of data used for training the network (known as the “training
set”). Periodically, the network's performance on the test set is evaluated. A decrease in
test-set performance coupled with an increase in training-set performance indicates
overfitting. At this point, the weights in the network have begun to adapt to the
particulars of the training set (e.g., noise or lack of sufficient data), rather than
forming a generalization of the full population from which the training samples are chosen.
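
The divergence test described above can be sketched as a simple check on the two error
histories. This is a hypothetical helper, not a procedure taken from this work: the
function name, the `patience` parameter (how many consecutive evaluations must diverge
before overfitting is declared), and the sample error values are all assumptions.

```python
def overfitting_onset(train_err, test_err, patience=3):
    """Return the index of the last evaluation before test error began to rise
    while training error kept falling for `patience` consecutive evaluations,
    or None if no such divergence is found."""
    run = 0
    for k in range(1, min(len(train_err), len(test_err))):
        diverging = test_err[k] > test_err[k - 1] and train_err[k] < train_err[k - 1]
        if diverging:
            run += 1
            if run == patience:
                return k - patience
        else:
            run = 0
    return None


# Made-up error histories, one entry per periodic evaluation.
train_history = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
test_history = [1.1, 0.9, 0.7, 0.65, 0.7, 0.75, 0.8]
onset = overfitting_onset(train_history, test_history)
```

Here the returned index points at the evaluation with the lowest test error, which is a
natural place to stop training or to keep a copy of the weights.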

Figure 4.6 shows how overfitting affects performance for different training set sizes.
Overfitting becomes clear when the performance on the test set remains the same or
worsens, while performance on the training set improves. It is common that during training,
performance on test and training sets will improve until a certain point is reached when the
network stops generalizing, and begins to fit the particular data set. Use of a
“sufficiently-large” training set can reduce over-fitting problems, but this may not be
practical due to a lack of data, or to an adaptation speed requirement that needs a faster
solution than this data-intensive brute-force approach.

4.3.2 Systematic Complexity Control 

When training function-based neural networks such as this FCA, the goal is to achieve good
generalization by presenting the network with a large number of sample input patterns along
with the desired outputs. The hope is that the parameters that define the functionality of
the network will adapt to fit this training data, and will then respond correctly when
presented with new input patterns. The danger of overfitting arises when the network has
an excess of parameters to fit: the danger is that these parameters will be used to fit the
noise in the data and lead to poor generalization.

Figure 4.6: Training History, Performance on Test and Training Data

Overfitting is seen in the divergence of training and test performance, and is more of
a problem for small training sets.

It is generally accepted that the fewer parameters used in the model, the less chance of
excess functionality being used to fit noise, resulting in better generalization. The task
now is to find out which connections are required to implement the desired mapping, and
build a network using only those weights. The network architecture selection could be
performed manually, but this would not be practical. For this problem, a network with
feedthrough connections, weights corresponding to a layered network with five hidden
neurons, and the crosstalk connections on the outputs would probably work well and not be
susceptible to overfitting, given a good training set.

This heuristic approach may overlook some valuable extra connections, and may still
result in overfitting, so a systematic network-pruning technique is desired. One that was
found to be successful involves a modification of the cost function that the neural network
is trained to minimize. This approach was first proposed by Rumelhart and Weigend [44] [59].

The cost function is augmented with a term that places a cost on the complexity of the
neural network (complexity is defined by a mathematical function of the weight values).
The neural network is then trained to minimize this new cost function, using the same
gradient-based optimization methods as before.

This complexity-control structure is based on the following assumptions: 

1. The best generalization is the least-complex one that still performs an input-output
mapping with an acceptable error. Therefore, there is a user-defined parameter to
determine this balance between complexity and mapping performance.

2. The complexity of a mapping is related to the number of connections between neurons.
Therefore, the cost associated with each connection is zero when the connection is zero,
monotonically increases as the weight magnitude increases, then plateaus at a large
weight level. This way, the total complexity cost varies with the number of non-zero
weights, rather than with the size of the weights. The relatively-small weights will be
reduced towards zero, leaving the larger (and supposedly useful) ones unrestrained.

The complexity-control term is shown in Equation 4.1, and presented graphically in
Figure 4.7.

J_{complexity} = \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{(w_{ij}/w_0)^2}{1 + (w_{ij}/w_0)^2}    (4.1)

where

N = total number of neurons
w_{ij} = weight denoting the connection strength from neuron i to neuron j
w_0 = weight normalization parameter

(4.2)

Figure 4.7: Complexity-Cost Function

Weights having values near zero cost little. Weights with high values (indicating
that they contribute significantly to the network's function) cost λ, but the gradient
is small, so there is little incentive to decrease them. Weights near the inflection
point are small (they do not significantly affect network performance). The slope
here is highest, so the network has the most to gain by decreasing them.

Selecting the scale factor w_0 effectively sets the cutoff point for weights: it determines
where the inflection point of the complexity cost function occurs. This defines the
transition from a nearly-parabolic (for w << w_0) cost surface to one that asymptotically
approaches (for w >> w_0) a flat surface (i.e., with zero gradient). For w << w_0, weights
are very strongly driven to zero, whereas for w >> w_0, the gradient is near zero, and
weights are not restricted significantly.

Selecting a high w_0 will result in a nearly-parabolic cost function that keeps all weights
from growing too large. In the parabolic section, the gradient acting against each weight
is roughly proportional to the magnitude of that weight.

Selecting a low w_0 will have the effect of shutting off completely some of the weights,
while not affecting the others. This parameter is selected iteratively by the user.
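
The behavior described above can be explored numerically. The sketch below assumes the
standard weight-elimination form of the complexity term, in which each weight contributes
(w/w_0)^2 / (1 + (w/w_0)^2). The function names are illustrative, and the per-weight
gradient is obtained by differentiating that assumed form.

```python
def complexity_cost(weights, w0):
    """Sum of (w/w0)^2 / (1 + (w/w0)^2) over all weights (assumed form)."""
    total = 0.0
    for w in weights:
        r = (w / w0) ** 2
        total += r / (1.0 + r)
    return total


def complexity_gradient(w, w0):
    """Derivative of one weight's complexity cost with respect to that weight."""
    r = (w / w0) ** 2
    return (2.0 * w / w0 ** 2) / (1.0 + r) ** 2


if __name__ == "__main__":
    w0 = 1.0
    for w in (0.01, 0.1, 0.5, 1.0, 3.0, 10.0):
        print(w, complexity_cost([w], w0), complexity_gradient(w, w0))
```

Under this assumed form the cost of a single weight saturates at 1, so the total counts
the weights that are large compared with w_0, and the gradient nearly vanishes for large
weights, leaving them unrestrained.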

The complexity-control term is added to the total cost function with a weighting
parameter, λ, as shown in Equation 4.3.

J_{total} = J_{performance} + \lambda J_{complexity}    (4.3)

where

J_{total} = total cost function to be reduced by gradient-based optimization
J_{performance} = network-performance cost function, such as shown in Equation 2.5
J_{complexity} = network-complexity cost function, shown in Equation 4.1
\lambda = complexity-cost weighting parameter

(4.4)


The weighting parameter, λ, is set by the user on an application-by-application basis
to achieve the desired balance between performance optimization (e.g., thruster-mapping
performance) and complexity minimization (i.e., to reduce overfitting problems). The
parameter can be adjusted iteratively by observing performance on test and training data
sets, such as those shown in Figure 4.8.

Equations 2.5, 4.1, and 4.3 are combined, resulting in the total cost function shown in
Equation 4.5.



where

J_{total} = total cost function to be reduced by gradient-based optimization
F_{x,err}(T) = net force error in x-direction, (F_{x,des} - F_{x,act}), resulting from T
F_{y,err}(T) = net force error in y-direction, (F_{y,des} - F_{y,act}), resulting from T
\tau_{err}(T) = net torque error about z-axis, (\tau_{des} - \tau_{act}), resulting from T
normalizing factor for F_{x,err} and F_{y,err}, force per nominal thruster
normalizing factor for \tau_{err}, torque per nominal thruster
T = binary thruster values, [T_1 T_2 T_3 T_4 T_5 T_6 T_7 T_8]
thruster number
\lambda = complexity-cost weighting parameter
i = number of neuron where connection originates
j = number of neuron where connection terminates
N = total number of neurons
w_{ij} = weight denoting the connection strength from neuron i to neuron j
w_0 = weight normalization parameter

(4.6)

The benefits of this method may be seen in the training histories shown in Figure 4.8.
The network had five hidden neurons, and without any sort of complexity reduction,
overfitting is clearly a problem, given the reduced training set. With the addition of the
complexity term, overfitting was controlled, resulting in comparable performance on test
and training sets.

The complexity control function and training histories for a fully-connected network
with 5 hidden neurons are plotted in Figure 4.8. Without complexity control, overfitting
becomes clear at around the 4000th epoch, as the performance on the test set worsens,
while performance on the training set improves. With the addition of the complexity term,
overfitting is controlled, as performance histories on test and training sets no longer
diverge.

Figure 4.8: Complexity Cost Reduces Overfitting

In the case with no complexity control, the divergence of network performance on
the test and training sets indicates overfitting. Addition of a complexity cost term
is successful in controlling overfitting. Although training performance is worsened,
performance on the test set is improved, which is of course the desired outcome.


4.3.3 Other Complexity-Control Methods

Many systematic network-pruning techniques have been proposed and used successfully in
certain applications. For example, "weight decay" uses a cost function like
\lambda \sum w_{ij}^2 to try to reduce all the weights [39] [40]. Other methods completely
eliminate connections or neurons in an iterative process [48] or with a genetic
algorithm [65]. A survey of pruning methods is presented in [42]. The method used here has
been shown to be effective in this application, but other methods may work as well or
better at improving generalization performance.

4.3.4 Automatic Growing of the Network

The above-mentioned complexity control method works by selecting a network topology,
and then trimming the excess connections to achieve the desired complexity. An alternative
is to begin with a small network, and add neurons to achieve the desired functionality. One
advantage of growing a network is the potential for an increase in learning speed. With
fewer hidden neurons, very quick learning takes place since fewer computations are required
(this would not be true for a parallel hardware implementation, but is true for the more
common serial implementation). Additionally, fewer training patterns are required (to avoid
overfitting), further reducing the number of computations required during training.

Growing the network is not a new concept; it is similar to the Cascade Correlation
network proposed by Scott Fahlman, in which the network is grown one neuron at a time [11].
This has been found to have benefits beyond the reduction in required computation:
reduction of the "moving-target" problem^2, and reduction of susceptibility to getting stuck
in local minima. The network adapts until performance asymptotically approaches an
optimum; then a neuron is added. These extra degrees of freedom are often sufficient to
break the network out of a local minimum. In Cascade Correlation, the previous hidden
neuron weights are frozen while the weights for the new neuron are adapted. This
simplification of the search space reduces the moving-target problem. It can reduce
computation if batch training is used, and the previously-calculated neuron activations are
stored in memory.

In the real-time implementation required for the robot-control application, the network
is grown automatically. Beginning with a small number of hidden neurons and a small
training set, the initial learning rate is high. As network performance plateaus (measured
by a sustained cessation of improvement in test set performance), hidden neurons are added,
a small batch at a time. As the number of hidden neurons increases, the network performance
approaches optimality, but at the expense of slower training. This approach fits well with
the control application, since rapid stabilization and coarse optimization are important,
while rapid attainment of near-optimal control is not so critical.
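
The growth trigger described above can be sketched as a plateau test on the recorded
test-set error. This is only one plausible reading of "sustained cessation of
improvement": the function name, the `window` length, and the `min_improvement` threshold
are assumptions made for illustration.

```python
def should_add_neurons(test_errors, window=5, min_improvement=1e-3):
    """Return True when test-set error has stopped improving.

    The best error over the last `window` evaluations is compared with the
    best error seen before that. If the gain is below `min_improvement`,
    performance is taken to have plateaued and a batch of hidden neurons
    may be added.
    """
    if len(test_errors) <= window:
        return False  # not enough history to judge a plateau
    recent_best = min(test_errors[-window:])
    earlier_best = min(test_errors[:-window])
    return earlier_best - recent_best < min_improvement


if __name__ == "__main__":
    history = [1.0, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
    print(should_add_neurons(history))
```

In a training loop, a True result would be followed by adding a small batch of hidden
neurons and clearing the history so that the new, larger network is judged on its own
progress.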


^2 This refers to the weights changing directions and back-tracking throughout the training
while the network approaches a final solution. While this is not necessarily bad, it can
slow down learning.


4.4 Summary of Implementation Issues for the FCA

The above has outlined the features of the new FCA developed in the present research, and
of a number of issues in using it effectively. The specific use of complexity control,
network growing, and the extra connections offered by the FCA will vary from one
application to another. The implementation issues for the robot-control application are
outlined here.

• The FCA's feedthrough weights are the most important feature, as they provide
near-immediate stabilization. Some implementation requirements, such as the use
of parallel hardware, or the use of software optimized for vector-processing on a serial
computer, can place a high cost on the use of hidden layer interconnections. In such
a case, these connections may need to be eliminated.

• The use of automatic growing has been found to produce a significant improvement in
initial learning rate. Since the added coding requirements are minimal, this technique
should be used whenever there is a requirement for fast initial learning.

• Complexity control is simple to implement, and has been shown to reduce overfitting
problems, so its use is recommended.

Many modifications to backpropagation that claim to improve learning speed have been
proposed in the literature. Backpropagation is an algorithm for efficiently calculating the
derivatives of a cost function with respect to the weights in a neural network. Once this
gradient estimate is obtained, any of the several existing gradient-based optimization
methods may be used. Some algorithms specific to neural networks have been developed that
attempt to take advantage of some features specific to neural-network optimization.

The simplest implementation multiplies this gradient estimate by a fixed parameter
to calculate the change in weights. More complex implementations adapt this learning
rate parameter, or add a "momentum" term that sums past gradient estimates to filter
out high-frequency noise and integrate low-frequency trends. Several other methods, such
as conjugate gradient, Levenberg-Marquardt, Quickprop, and other second-order gradient
optimization schemes have proven successful in certain applications. However, the benefits
of each algorithm appear to be somewhat application-dependent.

For the robot-control application, batch learning, adaptation of the learning rate (in
this case, a matrix of independent learning parameters is used, adapted independently
for each weight), and use of momentum are used to accelerate learning. For the thruster
mapping application, this combination of enhancements to backpropagation has been found
to provide the best trade-off between simplicity of implementation and rate of adaptation.
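
One way to combine momentum with independently adapted learning rates is sketched below.
The text does not specify the adaptation rule, so the sign-agreement heuristic used here
(grow a rate while successive gradients agree in sign, shrink it when they disagree), the
function names, and all default values are assumptions made for illustration.

```python
def new_state(n_weights, initial_rate=0.01):
    """Per-weight learning rates, momentum terms, and previous gradients."""
    return {"rates": [initial_rate] * n_weights,
            "velocity": [0.0] * n_weights,
            "prev_grads": [0.0] * n_weights}


def adaptive_momentum_step(weights, grads, state, momentum=0.9,
                           up=1.1, down=0.5, max_rate=0.05):
    """Apply one batch update in place and return the weights."""
    for k, g in enumerate(grads):
        agreement = g * state["prev_grads"][k]
        if agreement > 0.0:      # same direction as last time: speed up
            state["rates"][k] = min(state["rates"][k] * up, max_rate)
        elif agreement < 0.0:    # direction reversed: slow down
            state["rates"][k] *= down
        # Momentum accumulates past gradient steps.
        state["velocity"][k] = (momentum * state["velocity"][k]
                                - state["rates"][k] * g)
        weights[k] += state["velocity"][k]
        state["prev_grads"][k] = g
    return weights


if __name__ == "__main__":
    # Toy quadratic cost J = sum(c_k * w_k^2) with unequal curvatures.
    curvature = [1.0, 10.0]
    w = [1.0, 1.0]
    state = new_state(len(w))
    for _ in range(200):
        grads = [2.0 * c * wk for c, wk in zip(curvature, w)]
        adaptive_momentum_step(w, grads, state)
    print(w)
```

Keeping a separate rate per weight lets directions with different curvature adapt at
different speeds, which is the motivation for the matrix of independent learning
parameters mentioned above.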




Chapter 5 


Gradient-Based Optimization for 
Discrete Systems 


The previous chapter dealt with direct training, and led to the development of the
neural-network architecture used to implement the thruster mapping. To allow indirect
training, where the learning signal (error) is generated based on the robot-model output
(rather than on an optimal teacher), the error must be backpropagated through the robot
model. The discontinuity introduced by the use of the robot's on-off thrusters presents a
significant obstacle, and makes absolutely necessary the development of the new training
method presented here. The solution to this problem is, in turn, a general algorithm for
performing gradient-based optimization for systems with discrete-valued functions.

The discrete-valued functions did not cause a problem for direct training, since in
that case the discrete values are supplied as the output patterns in the training set (e.g.
[0.87 -0.76 0.11] gets mapped to [0 0 1 1 1 0 0 0]). The fact that the target outputs are
restricted to 1's or 0's does not affect the training. However, for indirect training, the
on-off actuators are represented by discrete-valued functions that are used as a forward
model in the backpropagation training.

In this chapter, a new technique for backpropagation learning for systems with
discrete-valued functions is presented. It is applied to the on-off thruster control
problem described in Section 2, as well as to the generic problem of training multilayer
signum networks [6flj [70] [73].

5.1 Problem Statement

Optimization methods that use gradient information often converge much faster than those
that do not. Use of the backpropagation algorithm [46] [60] to get this gradient information
for training neural networks has made them useful in many applications; however,
backpropagation's requirement of continuous differentiability, not only for the network
itself, but for anything through which the error is backpropagated (e.g. the plant model in
a control problem), limits its applicability.

This is a significant limitation since there are many applications where discrete-valued
states arise. Examples include on-off thrusters commonly used in spacecraft (the example
used in this research); other systems with discrete-valued inputs and outputs; and neural
networks built with signums (also known as hard-limiters or Heaviside step functions) rather
than sigmoids. Signum networks may be preferred to sigmoidal ones due to hardware
considerations.

In cases like these, one choice is to use an alternative method not restricted to
continuously-differentiable functions, such as unsupervised learning, simulated annealing,
or a genetic algorithm; but these are usually significantly slower to train, because they
do not use gradient information.

Also, it is common for a problem to be well-suited for gradient-based optimization,
except for the presence of discrete-valued functions (DVFs). The neural-network thruster
mapping is a prime example: a neural network (differentiable) produces an output that is
discretized (with a non-differentiable function) and then passed through a model of the
robot-thruster system (differentiable) before the performance can be evaluated and used for
training. Except for the DVF, this problem is well-suited for gradient-based optimization.
Rather than go to a completely different solution strategy, it is desirable to introduce a
modification to gradient descent that will accommodate the non-differentiable functions.
This sort of situation is rather common when DVFs are involved: the DVFs often represent a
small portion of the overall system, but the problem they present for gradient-based
optimization is formidable.




5.2 Related Research 

This problem is related to a similar problem that has received some attention in the field
of neural networks: training multilayered networks of hard-limiting neurons. The algorithm
presented here will be shown to be applicable to this problem. This section presents a
historical background and related research directed towards training signum networks.

5.2.1 History of Neural-Network Training With Smooth Activation Functions

Before the task of training a network built with DVFs is examined, it is useful to consider 
the history of the feedforward neural network, and why the sigmoid function was chosen in 
the first place. 

Learning algorithms for single-layer signum networks date back to 1960, with Widrow's
ADALINE (ADAptive LInear NEuron) [67] and Rosenblatt's Perceptron [43]. Unfortunately,
neither of these methods extends directly to multiple layers. Minsky's proof of
the functional limitations of single-layer Perceptrons [32] [33] combined with this lack of
a learning algorithm contributed to a reduction in interest in neural networks in the 1970s.

In 1974, Werbos [60] presented the backpropagation algorithm for the first time. While
the mathematics of the algorithm may be traced back to work in 1969 by Bryson on optimal
control [6], Werbos developed the algorithm for a number of applications, including neural
networks built with sigmoidal activation functions in the hidden layer. Unfortunately, this
work was largely unnoticed until its rediscovery and publication by Rumelhart in 1986 [46].
The key extension that allowed training of networks with hidden layers was the replacement
of the signum with the sigmoid. This allowed Bryson's work with multistage optimization for
dynamic systems to be applied to gradient-based optimization with the now-differentiable
neurons. It is understood that use of a sigmoid in place of a signum is computationally
more expensive, without providing significant added functional complexity; however, the
use of a function that is continuously differentiable allows for the application of the
efficient gradient-based optimization methods developed by Bryson.


5.2.2 Neural-Network Training With Discrete-Valued Activation Functions

MADALINE (Many ADAptive LInear NEurons) Rule I was a two-layer network (one hidden
layer) that had a trainable first layer, but the second layer was a fixed logic operation,
such as OR, AND, or MAJ (majority) [18]. In MADALINE Rule II, Winter [74] used
a heuristic approach, which had limited success, at training a two-layer network of
hard-limiters (ADALINEs). These methods may be classified as "error-correction rules" rather
than "steepest-descent rules" (gradient-based) [67].

In recent research aimed at using gradient-based learning for multi-layer signum
networks, Bartlett and Downs^J use weights that are random variables, and develop a training
algorithm based on the fact that the resulting probability distribution is continuously
differentiable. The algorithm is limited to one hidden layer, requires all inputs to be 1
or -1, and needs extra computation to estimate the gradient.

Another method is to approximate the discrete-valued functions with linear functions or
smooth sigmoids during the learning phase, and switch to the true discontinuous functions
at run-time. This is similar to the original ADALINE, where the neuron was trained on its
linear output, but in operation, this output passed through a signum function [67]. This
method may work in cases where the behavior of the system with sigmoids is close enough
to that of the real system; however, this assumption is very often unreliable.

5.3 A New Training Algorithm: Approximation With Noisy Sigmoids

The method of noise injection is introduced by applying it to the training of a single
hard-limiting neuron, as shown in Figure 5.1. Although this neuron could be trained with the
ADALINE or perceptron learning rules, those methods do not extend to multiple layers.
The method presented here does not have this significant restriction.

The first block diagram in Figure 5.1 shows the neuron as it appears at run time: a dot
product with hard limiter. For simplicity in bookkeeping, the input, X, and weight, W,
vectors are augmented to include the threshold bias for the output function. The next two
diagrams show the neuron during training, where the signum has been replaced by a smooth
sigmoid function. The input, X, is propagated through the forward sweep, finally resulting
in an error, and a cost. The derivative of this cost is calculated and propagated through
the backward sweep, resulting in a ∂cost/∂X to be propagated to more units upstream,
and a ∂cost/∂net to be used in calculating ∂cost/∂W, which is used in the weight-update
algorithm.

This is almost the same as training a standard neuron with backpropagation; the only
difference involves the injection of zero-mean noise, N, immediately before the sigmoid.
While the mechanics of the backward sweep are identical, different weight updates result
because the forward sweep resulted in a different error.

Figure 5.1: Training Algorithm

During training, replace discontinuous signums with sigmoids, and inject noise
before the sigmoid on the forward sweep. The backward sweep calculation is the same
as standard backpropagation.

Note that the noise injection does not corrupt the calculation of ∂cost/∂W (just as the
desired signal does not). Using an unmodified backward sweep is not only the simplest
thing to do, it does precisely the right calculations for estimating the weight gradient.

What makes this method useful is that as the noise level increases to cover the sigmoid's
transition region, adaptation with the resulting ∂cost/∂W leads to a set of weights that
work well for the signum network.

To summarize, the training algorithm is:

• Replace the hard-limiters with sigmoids during training

• Inject noise immediately before the sigmoids on the forward sweep

• Use the exact same backward sweep as with standard backpropagation
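
The three steps above can be sketched for a single hard-limiting neuron. This is a minimal
illustration under stated assumptions, not the original implementation: the sigmoid is
taken to be tanh with outputs in (-1, 1), the cost is half the squared error, the noise is
uniform, updates are made pattern by pattern, and the function names, learning rate, and
AND-style training set are all hypothetical.

```python
import math
import random


def train_noisy_sigmoid_neuron(patterns, targets, noise_half_width=1.0,
                               rate=0.1, epochs=3000, seed=0):
    """Train one neuron with noise injected just before the sigmoid."""
    rng = random.Random(seed)
    weights = [0.0] * (len(patterns[0]) + 1)  # last entry is the threshold bias
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            xa = list(x) + [1.0]  # augmented input
            net = sum(w * xi for w, xi in zip(weights, xa))
            # Forward sweep: zero-mean noise is added before the sigmoid.
            noise = rng.uniform(-noise_half_width, noise_half_width)
            y = math.tanh(net + noise)
            # Backward sweep: unchanged from standard backpropagation.
            dcost_dnet = (y - t) * (1.0 - y * y)
            for k, xi in enumerate(xa):
                weights[k] -= rate * dcost_dnet * xi
    return weights


def signum_output(weights, x):
    """Run-time neuron: the sigmoid is replaced by the hard limiter."""
    net = sum(w * xi for w, xi in zip(weights, list(x) + [1.0]))
    return 1.0 if net >= 0.0 else -1.0


if __name__ == "__main__":
    patterns = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
    targets = [-1.0, -1.0, -1.0, 1.0]  # logical AND with bipolar values
    w = train_noisy_sigmoid_neuron(patterns, targets)
    print([signum_output(w, x) for x in patterns])
```

Only the forward sweep differs from ordinary backpropagation. At run time the noise is
dropped and the hard limiter is restored.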


5.4 Intuitive Explanation 

Without addition of noise, the network may train using sigmoid output values in the sigmoid
transition region (roughly -0.8 to 0.8) that will be unavailable at run time. Simply
rounding off at run time may introduce significant errors. For example, in a hypothetical
cost surface, a value of 0.4 may be optimal, but if forced to choose between -1 and 1, a
value of -1 may be better.

The problem is much more apparent when the DVF outputs are recombined, such as
with the output layer of a network built with hidden signum units. This also occurs when
the robot-thruster physical parameters combine to produce a three-element force vector
based upon the binary eight-element thruster vector.

The goal of noise injection is to move neuron activations away from the transition region,
so that roundoff error will be small when the discrete-valued functions are replaced. For
this reason, the standard deviation of the noise is chosen to be higher than the width of
the transition region of the sigmoid.

Figure 5.2 shows how the neuron output distribution changes as the noise level increases:
with no noise, only a single output can result; but as noise increases to cover most of
the transition region, the output distribution approaches that of a hard-limiting function.
Differentiability is maintained, however, so that gradient information will be available to
speed up learning. Since the noise has pushed the distribution to approximate a
hard-limiting nonlinearity, when the hard-limiter is re-introduced at run-time the
performance degradation will be small.
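
The effect of the noise level on the output distribution can be illustrated with a small
calculation. The sketch assumes a tanh sigmoid of unit sharpness, uniformly distributed
noise, and a transition region defined by an output magnitude below 0.8. The function name
and these thresholds are illustrative choices.

```python
import math


def fraction_outside_transition(mean_input, noise_half_width, output_limit=0.8):
    """Fraction of sigmoid outputs with magnitude of at least `output_limit`.

    The sigmoid input is mean_input plus uniform noise on
    [-noise_half_width, +noise_half_width]; the sigmoid is tanh.
    """
    edge = math.atanh(output_limit)  # input magnitude at the region boundary
    if noise_half_width == 0.0:
        return 0.0 if abs(mean_input) < edge else 1.0
    low = mean_input - noise_half_width
    high = mean_input + noise_half_width
    # Length of the input interval that falls inside the transition region.
    overlap = max(0.0, min(high, edge) - max(low, -edge))
    return 1.0 - overlap / (high - low)


if __name__ == "__main__":
    for half_width in (0.0, 0.5, 2.0, 5.0, 20.0):
        print(half_width, fraction_outside_transition(-0.3, half_width))
```

As the noise spreads the input distribution, a growing share of the outputs lies near -1
or 1, which is the sense in which the noisy sigmoid approaches a hard limiter.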

5.5 Application Considerations, Extensions

5.5.1 Selection of Noise Level 

One concern is the attenuating effect of the derivative-of-sigmoid function. When
backpropagated through many layers of near-saturated sigmoids, the error signal is
attenuated and may lead to slow learning. To handle this problem, it may be necessary to be
gradual in increasing the noise level: slowly push the outputs from the linear region to the
hard limits, rather than all at once. However, since all the experiments presented here had
a single layer of discontinuity, no such gradual increase was required.

For training networks with simple bi-level sigmoids, once the noise reached a sufficient
level (roughly 0.5 and 3 in two different applications), there was no degradation if it were
increased beyond that level. The only possible drawback is the attenuation effect mentioned
above. The required noise level varies in different applications depending upon how sharp
the decision boundaries would be with no noise (i.e. if it is a sharp sigmoid, not much
noise is needed to force it off the transition region).

Figure 5.2: Effect of Noise Level on Sigmoid Output Distribution

Lightly-shaded region in column 1 represents the sigmoid input probability distribution
(in this case, -0.3 + uniformly distributed noise). Darkly-shaded region in column 3 is
the sigmoid output distribution (from -1 to 1). Each distribution has an area of 1. Input
and output are plotted together in column 2 to show how the sigmoid produces this
input-output relationship. As noise level increases, and the input distribution spreads
out, the sigmoid output approaches that of a hard-limiter, while remaining differentiable.

When multilevel sigmoids are used, as seen in Figure 5.8, there is an upper limit to
the noise level: too much noise may cause the individual sigmoids to overlap, which in this
example would blur out the middle level. The specific level of noise at which this effect
begins depends upon the sharpness of the sigmoids and the discrete values approximated.
In Figure 5.8, with a sharpness factor of 4 (slope at midpoint = 4) and one unit between
discrete levels (-1, 0, 1), this effect begins around N = 0.2 and is significant at around
N = 0.3. These values could have been predicted by sketching the limits of the noise-altered
function (the shaded region in Figure 5.8) and determining at what point the middle region
(input = 0 ⇒ output = 0) becomes affected by the noise.

The key idea in this algorithm is that the network performance error is linked to
round-off error due to use of the sigmoid transition region. The goal of the noise injection
is to discourage use of this transition region. Therefore, whether use is discouraged using
Gaussian noise, uniform noise (used here), fixed-level noise, or additive penalty functions,
the effect is qualitatively the same.


5.5.2 Discrete-Valued Functions Other Than Bi-Level Signums

If adapting a system that contains discrete-valued functions that are not simple Heaviside
step functions, the method may work if a continuously differentiable approximating function
is used. For example, a function whose output can take on multiple discrete values may be
approximated by combining multiple sigmoid functions. For the thruster mapping problem
described in Section 4, the thruster can take on three states: forward, off, or backward.
Two bi-level (-1, 1) sigmoids were summed to produce a tri-level (-1, 0, 1) sigmoid.
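
A tri-level approximating function of this kind can be sketched as follows. The details
are assumptions: tanh is used as the bi-level sigmoid, the two sigmoids are shifted to
inputs of -0.5 and +0.5, and their sum is halved so that the output levels are -1, 0, and
1. The function names and default values are illustrative.

```python
import math


def tri_level_sigmoid(x, sharpness=4.0, offset=0.5):
    """Smooth approximation of a three-state (-1, 0, 1) function."""
    lower = math.tanh(sharpness * (x + offset))  # switches near x = -offset
    upper = math.tanh(sharpness * (x - offset))  # switches near x = +offset
    return 0.5 * (lower + upper)


def tri_level_sigmoid_slope(x, sharpness=4.0, offset=0.5):
    """Derivative with respect to x, as needed for a backward sweep."""
    lower = math.tanh(sharpness * (x + offset))
    upper = math.tanh(sharpness * (x - offset))
    return 0.5 * sharpness * ((1.0 - lower * lower) + (1.0 - upper * upper))


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, tri_level_sigmoid(x), tri_level_sigmoid_slope(x))
```

Because the function is differentiable everywhere, it can stand in for the three-state
thruster during training, with the true discrete function restored at run time.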

In fact, the sigmoid-based approximation may be developed through a supervised training
technique using standard backpropagation. The limitation introduced by the attenuation of
error signals is again a factor, and must be considered when developing the smooth
approximating function. This can be done by limiting the sharpness of the sigmoids if
programming by hand. If training the approximating function, adding a complexity
cost [59] [71] will keep the weights small, and will systematically limit the sharpness.

5.5.3 Batch Learning 

The randomness introduced with the addition of noise could make learning slow because
of the reduction in signal-to-noise ratio in the weight gradient estimation. Batch learning
using the exact same training set from one epoch to the next worked well (considering
the "training set" to include the "input set" and "noise set"). Freezing the training set
and noise set defines a fixed deterministic cost hyper-surface. With a fixed cost function,
on-line tuning of momentum and learning rate can be applied to improve dramatically the
convergence rate.
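
The idea of freezing the noise set can be sketched as follows. The noise samples are drawn
once and stored alongside the input patterns, so the batch cost becomes an ordinary
deterministic function of the weights. The single tanh neuron, the half-squared-error
cost, and the function names are assumptions made for illustration.

```python
import math
import random


def make_noise_set(n_patterns, half_width, seed=0):
    """Draw one uniform noise sample per training pattern, once."""
    rng = random.Random(seed)
    return [rng.uniform(-half_width, half_width) for _ in range(n_patterns)]


def batch_cost(weights, patterns, targets, noise_set):
    """Batch cost of a single noisy-sigmoid neuron for a frozen noise set."""
    total = 0.0
    for x, t, n in zip(patterns, targets, noise_set):
        net = sum(w * xi for w, xi in zip(weights, x))
        y = math.tanh(net + n)  # the stored noise is reused every epoch
        total += 0.5 * (y - t) ** 2
    return total


if __name__ == "__main__":
    patterns = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
    targets = [1.0, -1.0, -1.0, -1.0]
    noise_set = make_noise_set(len(patterns), half_width=1.0)
    w = [0.3, 0.3]
    # Two evaluations at the same weights give the same cost.
    print(batch_cost(w, patterns, targets, noise_set),
          batch_cost(w, patterns, targets, noise_set))
```

With the cost surface fixed in this way, a change in cost between epochs reflects only the
change in weights, which is what makes on-line tuning of the learning rate and momentum
meaningful.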

5.5.4 Optimization of Discrete-Valued Parameters

Another area where this method has potential is for optimization problems that have
discrete-valued parameters. An example is a design optimization problem where the task is
to select the right DC motor, pipe diameter, or gear ratio from a finite set of
discrete-valued options. It is expected that this method will extend well to this family of
problems [31].


5.6 Application to Training Multi-Layer Signum Networks 

In this section, this method is shown to extend to multiple layers of hard-limiting units
with no modification. Figure 5.3 summarizes the method: during training, replace each
hard-limiter with a sigmoid and zero-mean independent noise source. Note that the sharpness
of the sigmoids does not matter at all here (except for numerical considerations), since the
sharpness factor simply multiplies the weights, and the weights are adapted.




Figure 5.3: A Multi-Layer Signum Network, Seen at Run-Time and During Training


In the first test, an adaptive 3-5-4 signum network is trained to emulate the input-output
mapping defined by an independent, fixed, 3-10-4 sigmoidal network. Fewer
hidden neurons are used in the adaptive network to ensure that overfitting will not introduce
unnecessary complications. The 3-10-4 network's fixed weights were randomly chosen
between -2 and 2.

[Figure panel title: Network Performance]

Figure 5.4: Training with Noisy Sigmoids of a Multi-Layer Signum Network,
Artificial Training Set

Left: with higher noise levels, performance on the noisy sigmoidal network approaches
that of the signum network, indicating that the noisy sigmoid is a valid
(and differentiable!) approximation for the signum. Right: As noise increases, the
network adapts to sharpen its sigmoids, causing the first-layer weights to increase,
and the sigmoid output distributions to approach hard-limiters. Activation
distributions were collected over the whole training set, with no noise added.


Performance is shown in Figure 5.4. Each dot on the graph represents the final performance
after a full training run (10,000 epochs or until a local minimum was reached). Seven
values for noise level were chosen, and ten different network initial conditions were used at
each noise value. With no noise, performance is good for the sigmoidal network, but when
the signums are reintroduced at run-time, the error increases dramatically. One point is off
the graph at an error of over 6 units. As noise increases, performance on the sigmoid network
decreases, as expected, but the signum-network performance improves, and approaches the
sigmoid-network performance. The weight magnitude and neuron activation distribution
plots confirm that as noise increases, the noisy sigmoids behave like hard-limiters. Note
that these activation distributions could not have been achieved by manually increasing the
sharpness of the sigmoids: this would have had zero net effect, since the network would
adapt the first-layer weights to counteract exactly the sharpness increase.

Figure 5.5: Training a Multi-Layer Signum Network, Thruster Mapping


In the second application, shown in Figure 5.5, the hard-limiting network is trained to
emulate the optimal thruster mapping, which will be described in detail in the next section.
For now, this mapping is used as an independent second test of the method. A similar
dramatic improvement in hard-limiting performance occurs as noise increases past about
0.3. It is not shown on the plot, but good performance is obtained at least up to a noise level
of three. The training set for this mapping represents continuous values being mapped to
discrete values, so the first-layer weights are high (indicating sharp decision hyper-surfaces),
even for noise = 0.


5.7 Application to Thruster-Mapping Problem

In order to demonstrate this new training procedure, it was applied to the thruster mapping
with indirect training, as shown in Figure 5.6 or the top section of Figure 2.13. In this case,
the optimal mapping is not used, and the neural network must learn the mapping through
experimentation with the plant model. This requires back-propagation of error through
the discontinuous thrusters, which motivated development of the noise injection method
presented in this chapter.

Training without this noise-injection technique produces large errors, because the
discrete-valued nature of the thrusters is not enforced during network training, and large
roundoff errors result at run time. For example, if one unit of thrust is requested in the
+x direction, during training, the network will set two thrusters, $T_i$ and $T_j$, to +0.5; but
at run time, for requested forces near 1.0, $T_i$ and $T_j$ are likely to both be 0 or both be 1,
resulting in a large error.

Figure 5.6: Thruster Mapping, Indirect Training Method
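The roundoff effect in this example can be reproduced with a few lines of arithmetic. The sketch below assumes two nominal 1 N thrusters acting in the requested direction and an on-off threshold of 0.5; both the threshold and the function names are assumptions made for illustration.

```python
def on_off(command, threshold=0.5):
    """On-off thruster: delivers 1 N when the command exceeds the threshold (assumed)."""
    return 1.0 if command > threshold else 0.0

def run_time_force_error(requested_force):
    """Force error when a continuous half-and-half solution is hard-limited."""
    per_thruster = requested_force / 2.0  # continuous solution learned in training
    delivered = on_off(per_thruster) + on_off(per_thruster)
    return delivered - requested_force

error_just_below = run_time_force_error(0.98)  # both thrusters stay off
error_just_above = run_time_force_error(1.02)  # both thrusters fire
```

Under these assumptions, a request of 0.98 N delivers no force and a request of 1.02 N delivers 2 N, so the error is close to a full unit of thrust in either case, even though the continuous solution had zero error during training.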


[Figure panel titles: Thruster Models; Performance on Actual Plant, Different Models. Axis label: run number]

Figure 5.7: Results of Indirect Training, Two Differentiable-Thruster Models

The sigmoid-based approximation (without noise) is better than the linear model,
but has limited performance. The results from direct training represent a lower limit
for comparison. Mapping error is average percent error above the optimal mapping
(which results from an exhaustive search of all possible thruster combinations).
The shaded areas represent the mean ± σ for ten different runs. 3-10-4 layered
networks were used.

Figure 5.8: Results of Indirect Training, Noisy Tri-Level Sigmoid Thruster Model

Left: the sigmoid sharpness factor (slope at the midpoint ≈ 4) and noise level (0.15)
for the noisy tri-level sigmoid appear to be intuitively correct. Right: as noise
increases, performance approaches that of the network trained directly (emulating
the optimal mapping), with best performance at a noise level of about 0.15. 3-10-4
layered networks were used.


Figure 5.7 shows the result of indirect training with two differentiable thruster models.
During training with the continuous thruster models, the neural network produces a
mapping with a very low error, which is not plotted here. However, when the continuous
thrusters are replaced by signum thrusters at run-time, the error is large, and is the
“thruster mapping error” plotted in the right half of Figures 5.7 and 5.8. The errors are
high because the network learned to optimize the solution using outputs that would be
unavailable at run-time. The resulting roundoff error is unknown to the neural network
during training.

In Figures 5.7 and 5.8, each dot represents the final performance after a 10,000 epoch
training run. The shaded regions represent mean ± σ performance for ten runs.

Figure 5.8 shows the results when the thrusters are modelled by noisy tri-level sigmoids.
With noise = 0, error is high, corresponding to the data in Figure 5.7, but as
noise increases, performance approaches that of the network trained directly (emulating the
optimal mapping).

The direct-training performance represents a lower bound set by the functional
complexity of the 3-10-4 layered network. The best noise value in this application seems
to be around 0.15, and the resulting noisy sigmoid is shown in the left half of Figure 5.8.
Examining this figure, the sigmoid sharpness and noise levels seem to be set correctly,
according to intuition. As noise increases beyond 0.2, error increases as expected (the “off”
region of the sigmoid becomes blurred). The method is fairly robust to the noise value
selected, and the effect of noise level on performance makes intuitive sense.

A good solution results when noise is added, because it prevents the network from
using a solution that uses non-saturated portions of the tri-level sigmoid. Such a solution
would give a nearly random output and high error during training. The training algorithm
must find a solution that works well despite the noise addition. This means the expected
value of the output must be well into the saturated regions to work consistently well. The
results approximate the optimal solution very well, and work when the tri-level sigmoids
are replaced with tri-level signums.
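The argument above can be checked numerically with a small sketch. The functional form of the tri-level sigmoid is not given in this section, so the version below (two shifted tanh steps, with a slope of about 4 at each step midpoint) is an assumption, as are the function names, the uniform noise distribution, and the placement of the noise on the input.

```python
import math
import random

def tri_level_sigmoid(u, sharpness=8.0):
    """Smooth stand-in for a tri-level signum with levels -1, 0, +1.

    Assumed form: two tanh steps centred at u = -0.5 and u = +0.5.
    With sharpness = 8, the slope at each step midpoint is about 4.
    """
    return 0.5 * (math.tanh(sharpness * (u - 0.5)) + math.tanh(sharpness * (u + 0.5)))

def output_spread(u, noise_level, n_samples=1000, seed=0):
    """Range of outputs seen when zero-mean noise is added to the input u."""
    rng = random.Random(seed)
    outputs = [tri_level_sigmoid(u + rng.uniform(-noise_level, noise_level))
               for _ in range(n_samples)]
    return max(outputs) - min(outputs)

spread_on_step = output_spread(0.5, noise_level=0.15)    # non-saturated operating point
spread_saturated = output_spread(1.0, noise_level=0.15)  # well into the "+1" region
```

At the non-saturated operating point the output varies over most of a full level, while in the saturated region the same noise changes the output by well under one percent. A solution that relies on intermediate outputs therefore sees a high, erratic training error, and the optimizer is pushed toward saturated operating points.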

5.8 Other Uses of Noise in Related Problems


Noise has been shown to be central to the success of this new algorithm. While this
particular use of noise is new, artificially-injected noise has been used successfully in previous
applications for control, neural networks, and optimization.

In control and signal processing, quantization error results when an analog signal is
sampled digitally (with inevitably finite precision) by an A/D converter. This effect was
first studied extensively in the Ph.D. work of Bernard Widrow, and published in [66]. In
analysing this phenomenon, the roundoff error may be treated as a source of noise. While
this work has little direct bearing on the algorithm presented here, the presence of noise
and roundoff error in the same problem is interesting.

In control applications, it is common to add an artificial dither signal to break the effects
of stiction. This dither is usually chosen to cause a force just large enough to overcome the
static friction, and is input at a frequency high enough that it does not affect the rest of
the control system. Again, there is little direct connection with the noisy-sigmoid training
algorithm, but it represents a previous application of artificial noise injection in control.

In the human vision system, the limitation of a finite number of receptors in the retina
is overcome by the artificial addition of a dither signal. Very small, high-frequency motions
of the eye are used to allow people to see thin far-away objects that might otherwise go
unseen due to the finite number of receptors.

When training a neural network with a limited set of training data, one approach to
control the effect of overfitting is to duplicate the elements of the data set, and add different
amounts of noise to each one, in an attempt to increase the effective size of the data set.

Adding noise to the weight updates has been tried, with some success, to improve the
learning speed of neural-network training [3d]. This is a similar concept to simulated
annealing, the addition of a random element in the weight update rule whose magnitude decreases
exponentially. The idea in simulated annealing is to prevent the common optimization
problem of getting stuck in a local minimum. If the magnitude of the random element is
decreased slowly enough (i.e., the time constant approaches infinity), convergence to the
global optimum is guaranteed. This gradual reduction in temperature is similar to that
in a metallurgical annealing process, hence the name. Simulated annealing is a common
algorithm in optimization for systems other than neural networks.

In genetic algorithms, species are evolved using two primary methods to go from one
generation to the next: (1) crossover, the combination of traits between
competitively-selected parents; and (2) mutation, the addition of a random element in the next-generation
chromosome.

While the above examples show that the concept of artificially-added noise for control
and optimization problems is well established, the use of noise presented in this thesis (to
produce an accurate differentiable approximation to a DVF for gradient-based optimization)
is completely new.


5.9 Summary

This chapter has described a new technique that allows backpropagation learning to work
with systems containing discrete-valued functions, despite the discontinuity that exists
between discrete values. The modification to backpropagation is very small, simply requiring
sigmoidal approximation of the discrete-valued functions, and the careful injection of noise
into the smooth approximating function on the forward sweep. The noise injection is critical
to ensuring that the noisy sigmoid behaves like a signum during training.

Multi-layered networks of hard limiters require simpler processing hardware than do
multi-layered sigmoid networks. Sigmoid networks are commonly used, however, due to
their increased functionality as well as the lack of a reliable training algorithm for signum
networks. Multi-layered signum networks have now been successfully trained using this
noise injection method in two different applications, clearly demonstrating its usefulness in
this area.

Application to a complex thruster-control problem, with implementation on a laboratory
model of a free-flying space robot, has demonstrated the method's reliability and
usefulness for on-off control problems.

In each application, the training behavior in the presence of noise has been well understood,
and the algorithm appears to be relatively robust to the amplitude of the injected
noise.




Chapter 6 


Experimental Demonstration of 
Reconfiguration 


Experiments were performed on the mobile robot described in Chapter 2 to verify the
applicability of these neural network results. Position and attitude of the robot base are
controlled while subject to multiple, large, possibly-destabilising changes in thruster
characteristics. The plant is linear and well-modelled, except for the actuators, which are on-off
thrusters that could have altered characteristics. An off-board vision system provides
high-bandwidth position feedback, which is then digitally filtered and differentiated to provide
velocity feedback. On-board accelerometers and an angular-rate sensor are used to provide
base-acceleration measurements used by the failure-detection and control-reconfiguration
capability. This chapter reviews the complete control system, and presents experimental
results.


6.1 System Overview 

Figure 6.1 shows the overall system block diagram, which was discussed initially in Chapter 3.
In this chapter, each block will be described in detail.

The User issues motion commands with a mouse-based graphical user interface (GUI)
that runs on a Sun¹ workstation adjacent to the robot. The user views an image of the
robot that is updated with real-time data from the Position Sensor described below. He
or she can use the mouse to drag a ghost image of the robot to the desired final location,


¹Sun is a trademark of Sun Microsystems, Inc.

Figure 6.1: Reconfigurable Control System - Block Diagram

This control system is based upon a conventional indirect adaptive controller, such
as a self-tuning regulator. Examples of the continuous-valued $F_{des}$ vector and
the corresponding discrete-valued $T$ vector are shown. The ID block represents
a recursive-least-squares identification of thruster strength and direction. This
continually-updated model is passed to the neural network training block, shown
in detail in Figure 5.6. The continually-updated neural thruster mapper is copied
periodically into the active control loop.


adjusting its position and orientation. The motion is then initiated by clicking on a button
that is part of the GUI.

The Trajectory Generator receives the current and desired position and velocity vectors
and generates a quintic-polynomial trajectory between the two locations. A quintic
polynomial means there are six coefficients of a polynomial function of time. These
parameters are chosen to match the initial and final position and velocity (four parameters)
and set acceleration to zero at initial and final times (the two remaining parameters). The
duration of the slew is minimized automatically while not exceeding the pre-defined
acceleration limits (corresponding to the limits in actuation). The result is a time history of
desired state, $X_{des}$, consisting of the desired position and velocity in each of the three
degrees of freedom, $(x, y, \psi)$.
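As an illustration of the six boundary conditions described above, the following sketch computes the quintic coefficients for one degree of freedom in closed form, with acceleration set to zero at both ends. The function names are hypothetical, and the automatic minimization of the slew duration is not reproduced; the duration is simply supplied as an argument.

```python
import math

def quintic_coefficients(x0, v0, xf, vf, duration):
    """Coefficients c0..c5 of x(t) = sum(c_k * t**k) matching position and velocity
    at t = 0 and t = duration, with zero acceleration at both ends."""
    d = xf - x0
    T = duration
    c3 = (20.0 * d - (8.0 * vf + 12.0 * v0) * T) / (2.0 * T ** 3)
    c4 = (-30.0 * d + (14.0 * vf + 16.0 * v0) * T) / (2.0 * T ** 4)
    c5 = (12.0 * d - 6.0 * (vf + v0) * T) / (2.0 * T ** 5)
    return [x0, v0, 0.0, c3, c4, c5]

def evaluate(coeffs, t):
    """Position, velocity, and acceleration of the polynomial at time t."""
    pos = sum(c * t ** k for k, c in enumerate(coeffs))
    vel = sum(k * c * t ** (k - 1) for k, c in enumerate(coeffs) if k >= 1)
    acc = sum(k * (k - 1) * c * t ** (k - 2) for k, c in enumerate(coeffs) if k >= 2)
    return pos, vel, acc

# Rest-to-rest slew of 1 meter in 20 seconds.
coeffs = quintic_coefficients(x0=0.0, v0=0.0, xf=1.0, vf=0.0, duration=20.0)
end_pos, end_vel, end_acc = evaluate(coeffs, 20.0)
peak_acc = max(abs(evaluate(coeffs, 0.01 * i)[2]) for i in range(2001))
```

For a rest-to-rest slew, the peak acceleration of this polynomial is $(10/\sqrt{3})\,d/T^2$, about 0.0144 m/s² for a 1 meter slew in 20 seconds. A duration-minimizing generator would shorten the duration until this peak reaches the acceleration limit.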

The PD Controller takes the desired state, $X_{des}$, from the Trajectory Generator,
and the measured state, $X$, from the Position Sensor. The translational proportional
and derivative gains are 32.5 N/m and 91 N/(m/s), resulting in closed-loop poles
at $s = -0.65 \pm 0.2j$ (neglecting effects of the on-off actuators). The output of this
component is a continuous-valued desired force vector, $F_{des}$, such as
[0.9 N, -1.3 N, 0.4 N-m].
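The quoted pole locations can be checked against the gains with a short calculation. The sketch assumes a single translational axis modelled as a pure mass (the 70 kg robot mass quoted in this chapter) under PD feedback, and reads the derivative gain as 91 N/(m/s); the function name is hypothetical.

```python
import cmath

def pd_closed_loop_poles(mass, kp, kd):
    """Roots of mass*s^2 + kd*s + kp = 0 for a pure mass under PD feedback."""
    disc = cmath.sqrt(kd * kd - 4.0 * mass * kp)
    return ((-kd + disc) / (2.0 * mass), (-kd - disc) / (2.0 * mass))

poles = pd_closed_loop_poles(mass=70.0, kp=32.5, kd=91.0)
```

Under these assumptions the poles come out at approximately -0.65 ± 0.204j, consistent with the rounded values quoted above.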

The Thruster Mapper takes the desired force vector, $F_{des}$, and produces the thruster
vector, $T$, that causes the thrusters to open or close. An FCA network is used to implement
the thruster mapper. Like the rest of the low-level control loop, it is written in the C
programming language and executed on a Motorola® 68040 processor (MVME 167) on
the robot. The real-time control system was developed with ControlShell² development
software and the VxWorks³ operating system. Details of the network are described below.
The signal flow of the thruster-mapper component is shown in Figure 6.2. The final output
is the binary eight-element vector of thrusters to fire, $T = [T_1\; T_2\; T_3\; T_4\; T_5\; T_6\; T_7\; T_8]$.


Figure 6.2: Thruster Mapper - Signal Flow

The signal flow of the “thruster-mapper” component shown in Figure 6.1 is presented.
The mapper produces a thruster vector based upon the desired force, but
this signal may be changed by the “Fire Control Module” during the identification
process. A list of thrusters to excite is provided by the “ID” component.
A FireOneOnly signal is also used to simplify the identification by limiting firing
to one thruster at a time. Both of these ID-related functions may be over-ridden if
the tracking error, $X_{err}$, is too high. The parameters (neural network weights) that
define the function implemented by the thruster mapper are periodically copied
from the neural-network training process.


²ControlShell is a trademark of Real-Time Innovations, Inc.
³VxWorks is a trademark of Wind River Systems, Inc.

The Robot has a mass of 70 kg, floats nearly frictionlessly on the granite table, and has
eight thrusters, each nominally producing 1 Newton of thrust. Since control of the robot
manipulators was not relevant to this research, the arms are commanded to maintain a fixed
position at all times. This involves RVDT sensors, an analog pre-filtering and differentiating
circuit, A/D converters, a PD controller for each of the four joints, D/A converters, motor
driver boards, and finally, the brushless DC motors and cable drive system that actuate
the arms. The arm endpoints are equipped with pneumatic plungers, allowing the robot to
capture a free-floating target object.

The Position Sensor is a pair of CCD cameras mounted to the ceiling above the
robot. Two cameras are required to cover the total surface area of the 2.74 x 3.05 meter
(9 x 12 foot) granite table. The cameras detect a pattern of LEDs mounted to the top
of the robot. A custom vision processing board processes the camera output, and produces
position information at a 60 Hz update rate that is accurate to better than 1 mm. This
(x, y, ψ) vector is digitally filtered and differenced to produce a velocity vector. The
processing is performed off-board and then communicated back to the robot via a wireless
Ethernet data/communications link.

The Sample Rate for the low-level control loop was chosen to be 10 Hz. This is
more than an order of magnitude faster than the PD controller bandwidth, and is slow
enough to allow transient acceleration effects to die out, leading to the accurate acceleration
information needed for reconfiguration. If reconfiguration is not required, the sample rate
can be increased to 60 Hz. Sampling faster than that produces no benefit, since the vision
system operates at 60 Hz, and the thruster bandwidth is approximately 30 Hz.
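The margin between the sample rate and the controller bandwidth can be estimated with a short calculation. The sketch uses the undamped natural frequency of a 70 kg mass under the 32.5 N/m proportional gain as a rough proxy for the PD controller bandwidth; that proxy is an assumption made for illustration, not a definition taken from this work.

```python
import math

mass = 70.0  # kg, robot mass quoted in this chapter
kp = 32.5    # N/m, translational proportional gain

# Undamped natural frequency of the PD loop, used here as a bandwidth proxy.
natural_freq_hz = math.sqrt(kp / mass) / (2.0 * math.pi)
ratio_at_10_hz = 10.0 / natural_freq_hz
```

Under this assumption the loop's natural frequency is about 0.11 Hz, so a 10 Hz sample rate is roughly 90 times faster, comfortably more than an order of magnitude.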

Summary of the signal flow in the low-level control loop: LEDs on top of the
robot emit infrared light. CCD cameras on the ceiling receive the light, and send the signal
via a coaxial cable to the custom vision processing board mounted on a fixed rack adjacent
to the granite table. The “pointgrabber” vision board scans the image for bright pixels.
When the known pattern of LEDs is located, the vision board calculates the orientation and
geometric center of the robot. Velocity is also calculated on the vision board by digitally
filtering the position information. The 6-element robot state vector is broadcast to the
robot at a 60 Hz update rate (and less than 30 ms total time delay) over the Motorola
Altair wireless Ethernet system. The robot then sends this information back to the user
interface running on a Sun workstation. The on-board microprocessor takes the state vector
and uses the PD controller to calculate the desired force, converts to robot coordinates,
and uses the Thruster Mapper to calculate the thruster vector (e.g. [1 0 1 0 0 0 0 1]).
This vector is sent over the VME backplane to the digital I/O board, which then controls
the opening and closing of the eight solenoid valves. This releases air from the 100 psi
reservoir out through the converging-diverging nozzles to produce one Newton of thrust per
thruster.

The Acceleration Sensors are described in detail in Chapter jh. Two accelerometers
are mounted on the base orthogonal to one another, along with an angular-rate sensor.
The acceleration signals and angular-rate signal are pre-filtered to remove the effects of
extraneous vibrations. The filtered signals pass through an A/D converter, and then through
the VME backplane to the microprocessor. The base translational acceleration vector is
derived by subtracting centrifugal accelerations (calculated using angular-rate information)
and converting to the robot frame. The angular-acceleration signal is obtained by digitally
filtering and differencing the angular-rate signal. The $[\ddot{x}, \ddot{y}, \ddot{\psi}]$ vector of the robot base is
the output of this component, as shown in Figure 6.3.



Figure 6.3: Acceleration Sensors - Signal Flow 

The signal flow of the “acceleration sensors” component shown in Figure 6.1 is
presented. The accelerometers are filtered with analog and digital filters to produce
the Accel #1 and #2 signals. The angular-rate sensor signal is similarly filtered,
with the additional step of a digital difference, which produces $\ddot{\psi}$ as well as $\dot{\psi}$. $\ddot{\psi}$ is
output directly, while $\dot{\psi}$ is used to compensate for centrifugal accelerations measured
by the accelerometers. The acceleration signals are then rotationally transformed
to align with the x and y coordinates of the robot. When the angular-rate sensor
saturates, angular rate and acceleration derived from the overhead vision system
are used, as indicated by the logical switches.


The ID component identifies the characteristics for each of the eight thrusters. This is
described in detail below. At a simple level, it takes in the acceleration vector and thruster
vector, and performs a recursive linear regression to identify the thruster parameters. The
more-complicated factors, such as failure detection and thruster excitation, are described
below, and summarized in Figure 6.9. A linear regression may be used here, since the
forward model of the thrusters is linear, e.g. firing one thruster may produce -1.03 N in the
x direction, 0.07 N in y, and 0.137 N-m in ψ. The result is a 24-element matrix, containing
the thrust produced by each of the eight thrusters in each of the three degrees of freedom.
This is the “robot model” indicated in Figure 6.1.
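A minimal sketch of a recursive-least-squares update for the 3 x 8 thruster model is given below. It assumes noise-free measurements already expressed in force units and excitation of one thruster at a time; the function name `rls_update`, the initial covariance, and all entries of `TRUE_MODEL` other than the first column (taken from the example above) are hypothetical.

```python
def rls_update(model, cov, firing, response, forgetting=1.0):
    """One recursive-least-squares step for the linear model response = model @ firing.

    model: 3 x 8 estimate (updated in place); cov: 8 x 8 covariance (updated in place).
    """
    n = len(firing)
    cov_phi = [sum(cov[i][j] * firing[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(firing[i] * cov_phi[i] for i in range(n))
    gain = [value / denom for value in cov_phi]
    for r in range(len(model)):
        error = response[r] - sum(model[r][j] * firing[j] for j in range(n))
        for j in range(n):
            model[r][j] += error * gain[j]
    # cov <- (cov - gain * firing^T cov) / forgetting; cov is symmetric.
    for i in range(n):
        for j in range(n):
            cov[i][j] = (cov[i][j] - gain[i] * cov_phi[j]) / forgetting

# Hypothetical "true" thruster model: rows are x force, y force, torque.
TRUE_MODEL = [
    [-1.03, -1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.07, 0.0, 0.0, 0.0, 1.0, 1.0, -1.0, -1.0],
    [0.137, -0.14, 0.14, -0.14, 0.14, -0.14, 0.14, -0.14],
]

n_thrusters = 8
estimate = [[0.0] * n_thrusters for _ in range(3)]
covariance = [[1.0e6 if i == j else 0.0 for j in range(n_thrusters)]
              for i in range(n_thrusters)]

for _ in range(3):                      # three excitation passes
    for k in range(n_thrusters):        # fire one thruster at a time
        firing = [1.0 if j == k else 0.0 for j in range(n_thrusters)]
        response = [sum(TRUE_MODEL[r][j] * firing[j] for j in range(n_thrusters))
                    for r in range(3)]
        rls_update(estimate, covariance, firing, response)

max_error = max(abs(estimate[r][j] - TRUE_MODEL[r][j])
                for r in range(3) for j in range(n_thrusters))
```

With clean data and single-thruster excitation, the estimate converges after each thruster has fired once. With measurement noise, a forgetting factor below one trades noise rejection against the speed with which a newly failed thruster is tracked.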

Figure 6.4: Neural-Network Training - Signal Flow

The signal flow of the “NN train” component shown in Figure 6.1 is presented. The
model used in training is updated by the ID process, and the neural-network thruster
mapper developed here is copied periodically to the thruster mapper running on the
robot. The algorithm used to adapt the neural network based on the error signal is
shown in Figure 5.6.

The NN train component is responsible for redesigning the thruster mapper to account
for changes in the robot model. It waits until a major change is detected, calculates a
linear mapper, and implements it on the robot using the FCA, described in Chapter 4.
When smaller changes occur (as the ID process converges), the model used for training
is updated. If further major changes are detected, the network is reinitialized to a
newly-calculated linear mapper. Indirect training is performed using the modified backpropagation
algorithm described in Chapter 5. The thruster mapper being trained is copied periodically
to the thruster mapper running on the robot. The network is grown gradually, resulting in
a fast initial learning rate. The details of the training are presented below, and summarized
graphically in Figure 6.4.

Summary of the signal flow in the adaptive system: Accelerometers and an
angular-rate sensor measure motion of the robot base. The raw signals are prefiltered
on-board, pass through an A/D converter to the microprocessor, where the dynamics are
accounted for, and the base acceleration vector is computed. This signal is transmitted
using the wireless Ethernet to a Sun workstation that is running the ID process. The ID
process forms the robot model and transmits updates to the NN training process running
on another Sun workstation. The updated neural-network thruster mapper is copied
periodically to the robot via the wireless Ethernet, where it is substituted for the thruster
mapper running in the control loop.


6.2 Trajectory-Following Performance

Before the reconfiguration capabilities are presented, trajectory-following performance with
all thrusters working is discussed. This serves two purposes. First, it demonstrates that the
base-control strategy of separating the thruster-control system into a control component
and a thruster-mapping component is valid. Second, it demonstrates that a neural-network
emulation of the search-based thruster mapper (which is optimal) can provide near-optimal
performance.

When evaluating performance, the effects of the on-off actuators should be considered.
Due to the control structure, PD-control gains, and thruster-mapping cost function selected,
a deadband exists within which the thrusters will not fire, even with an optimal thruster
mapper. While the size of this deadband is difficult to characterise due to the thruster
coupling effects, the maximum static deadband (assuming zero velocity error and error in
one degree of freedom only) is approximately 2.9 cm in translation and 10.0° in yaw angle
(with the nominal thruster configuration).

6.2.1 Trajectory-Following Performance: One Degree of Freedom

Figure 6.5 shows the trajectory-following performance for a single-degree-of-freedom
maneuver. The robot base position is commanded to follow a quintic-polynomial trajectory in
the x direction. The trajectory parameters are chosen to achieve the desired final position
while setting initial and final velocity and acceleration to zero. Because of this, a couple
of seconds pass before the thrusters fire, even though the trajectory begins at t = 0. The
duration of the maneuver is set automatically, by keeping the peak acceleration within the
actuation limits of the robot. In this case, the 1-meter slew took 20 seconds.

Figure 6.5: Single-Axis Trajectory Following, Experimental Results

Trajectory-following performance is plotted for a quintic-polynomial trajectory
of length 1 meter and duration 20 seconds, in the x direction. The nominal
thruster configuration is present. An FCA network with 5 hidden neurons provides
trajectory-following performance comparable to that of the optimal thruster
mapper, which is implemented via exhaustive search.


The control system used is the one described above, except that no adaptation is
required. Two different thruster mappers are used: a neural-network mapper implemented
with an FCA network with 5 hidden neurons; and an optimal thruster mapper, implemented
via exhaustive search. Both mappers are aided by symmetries (as described in Chapter 2).
Although the neural mapper is sub-optimal (mapping performance on a set of test data
resulted in average force errors 3.5% greater than the optimal mapper), the
trajectory-tracking performance is comparable. Due to the presence of feedback, the 3.5% mapping
error is not significant, considering the other disturbing factors, such as imperfect thruster
characteristics (steady-state and transient), sensor noise, and deadband.
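An exhaustive-search mapper of the kind referred to above can be sketched in a few lines, since eight on-off thrusters give only 2^8 = 256 candidate firing patterns. The cost function used in this work is not reproduced here; the squared force error plus a small fuel penalty, the `MODEL` matrix, and the function name are all assumptions made for illustration.

```python
import itertools

def exhaustive_thruster_map(model, f_des, fuel_weight=0.01):
    """Search all on/off patterns for the one minimising an assumed cost:
    squared force error plus fuel_weight per thruster fired."""
    best_combo, best_cost = None, float("inf")
    for combo in itertools.product((0, 1), repeat=len(model[0])):
        cost = fuel_weight * sum(combo)
        for row, target in zip(model, f_des):
            produced = sum(b * t for b, t in zip(row, combo))
            cost += (produced - target) ** 2
        if cost < best_cost:
            best_combo, best_cost = combo, cost
    return best_combo, best_cost

# Hypothetical thruster model: rows are x force (N), y force (N), torque (N-m).
MODEL = [
    [1.0, 1.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, -1.0, -1.0],
    [0.2, -0.2, 0.2, -0.2, 0.2, -0.2, 0.2, -0.2],
]

combo_large, cost_large = exhaustive_thruster_map(MODEL, [2.0, 0.0, 0.0])
combo_small, cost_small = exhaustive_thruster_map(MODEL, [0.4, 0.0, 0.0])
```

With this hypothetical model, a 2 N request in x fires the first two thrusters, while a 0.4 N request fires none, since firing any thruster would overshoot by more than the error of doing nothing. The second case is an instance of the deadband that exists even with an optimal mapper.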



Figure 6.6: Multiple-Degree-of-Freedom Trajectory Following

The initial, middle, and final positions are illustrated for a multi-coordinate
maneuver (x, y, ψ). Quintic-polynomial trajectories are followed simultaneously in
each of the three degrees of freedom. The position of the robot's geometric center
obtained using the FCA mapper is also plotted (heavy black line).


6.2.2 Trajectory-Following Performance: Three Degrees of Freedom

For the multi-coordinate maneuver (x, y, ψ) shown in Figure 6.6, good tracking is obtained
again from both optimal and neural-network thruster-mapping components. In this
22-second-long trajectory, the robot simultaneously translates 1 meter in the -x direction,
1 meter in the +y direction, and 180° in the +ψ direction. The position of the robot's
geometric center is plotted in this figure. Quintic-polynomial trajectories are used for each
degree of freedom. Each is executed simultaneously, with the peak acceleration for each
degree of freedom limited to the physical actuation limits.


[Figure panel titles: Trajectory Following, FCA-Neural-Network Thruster Mapper; Trajectory Following, Optimal Thruster Mapper]

Figure 6.7: Multiple-Degree-of-Freedom Trajectory Following, Experimental Results

Trajectory-following error for the multi-coordinate maneuver illustrated in Figure 6.6
is plotted for each of the three coordinates (x, y, ψ). The performance of
the FCA mapper with 5 hidden neurons is excellent, and is comparable to that of
the mapper implemented with exhaustive search (“Optimal Thruster Mapper”).


Trajectory-following errors for this multi-coordinate maneuver are plotted in Figure 6.7,
providing a comparison of the neural and optimal thruster mappers. This experiment
used the same controller as used for the single-degree-of-freedom maneuver described above.
Again, the performance is excellent, and comparable results are obtained from the neural
and optimal mappers.

6.2.3 Trajectory-Following Performance: Summary

This preliminary experiment has verified the applicability of the non-adaptive portions of
the neural network control system. The neural mapper was shown to provide
trajectory-following performance comparable to the optimal mapper, which was implemented by
exhaustive search. As discussed earlier, the advantages of the neural-network approach do
not apply strongly in this application until there is the requirement for reconfigurability.


6.3 Control Reconfiguration Problem Definition 

Figure 6.8 shows the thruster layout in the nominal configuration as well as an example of
a dramatically-failed configuration. The magnitude and direction of each thruster is shown.
Nominally, each thruster produces 1 Newton of force, directed as shown. The failures were
produced by mechanically changing the thrusters. Failures include: half-strength (one
thruster), plugged completely (one thruster), angled at 45° (two thrusters), and angled at
90° (two thrusters). The 90° failure modes place high demands on the control-reconfiguration
system, since they destabilize the robot (changing the direction of torque results in positive
feedback!).

Requirements for the reconfigurable control system include:

1. The robot is not informed of the nature of these failures, or even that a failure has
occurred. The adaptive system must first detect the failure(s), then identify the
new thruster characteristics, and finally train and implement a new neural-network
thruster mapper that accounts for those changes.

2. Control must be maintained at all times, but artificial excitation is allowed when
position errors are small. This requirement keeps the robot within the bounds of the
workspace (i.e. on the table), and allows it to carry on with its mission during the
reconfiguration. For example, in this case, the robot can be commanded to move
throughout the workspace during reconfiguration.

3. The entire adaptive system, including ID and retraining, is to be autonomous,
requiring no human intervention at all.

[Figure 6.8 panels: Nominal Configuration; After Multiple Failures]

Figure 6.8: Example Failure Mode

Magnitude and direction of each of the eight thrusters is shown. Thruster failures
were simulated mechanically with weaker thrusters and 90° and 45° elbows. Some
of the elbows destabilize the robot by changing the sign of thrust in the yaw direction.


Six out of eight thrusters have failed in the case presented here. There is no theoretical
limit to the number or type of failure that can be identified and accounted for by the
reconfigurable control system. However, there is a limitation if the controllability of the
robot is impacted. For example, if both thrusters on the front of the robot (as labelled
in Figure 6.8) had failed completely, and no other thrusters contributed force
in that direction (-x), there would be no actuation authority in the -x direction. If it
were necessary to accommodate failures like this, a higher level process (perhaps part of
the trajectory generator) could command the robot to rotate, bringing working thrusters
in line to provide the required thrust. In the example failure mode shown in Figure 6.8,
there is sufficient actuation authority in plus and minus directions for all three degrees of
freedom, so this issue is not yet addressed here.

To summarize, the basic reconfiguration strategy is to first detect the failure(s), then
identify the new thruster characteristics, and finally train and implement a new
neural-network thruster mapper. The structure of the control system is summarized in Figure 6.1.
Running the adaptive process (neural network training) in parallel with the identification
process leads to stabilization within seconds, and causes the robot to be well-controlled
during the identification.


6.4 Identification of Failures

Before reconfiguration can occur, the failures must be identified. The control system is not
informed of the number or type of failures beforehand. It must detect, and subsequently
identify, each of the failed thrusters. Failure detection and system identification are closely
related in this implementation, so they are presented together here. The signal flow of the
identification process is shown in Figure 6.9.



Figure 6.9: Identification Process - Signal Flow

Inputs are the thruster commands and acceleration signals, sampled from the real-time
system at 10 Hz. The primary output is the model of thruster characteristics,
a 3 × 8 matrix containing the forward mapping from thrusters to accelerations.
Additional outputs, T_excite and FireOneOnly, are used in the control loop as part
of the artificial excitation process.


6.4.1 Identification Summary

The task is to take in acceleration signals (x, y, and yaw accelerations) and thruster commands, and form a
model of the strength and direction of each thruster. Since this is a purely linear relationship,
there is no need for a neural network, and a linear-systems approach works well. When the
thruster model is found to deviate from the nominal, a thruster failure is “suspected.” The
thruster in question will be excited artificially to obtain more information about it, speeding
up the identification process. When a certain level of confidence is reached and the new
characteristics of the suspected thruster are confirmed, the artificial excitation is turned
off. Throughout the identification, model updates are sent to the neural network training
component. This procedure, explained here for one thruster, runs in parallel for each of the
eight thrusters.

There are a number of complicating factors for the system identification process: multiple
thrusters may be fired simultaneously; the acceleration signals are corrupted by extraneous
mechanical vibrations of the robot in the frequency range of interest; the response
time of the thrusters is on the order of the sample period; and variations in the reservoir
pressure during the firing of multiple thrusters affect the thrust output. These problems
are addressed by filtering, reduction of the sample rate to 10 Hz, and design of the system
ID process (e.g. waiting for a certain confidence level to be reached - i.e. collecting enough
data - before declaring a thruster failure).

At the heart of the identification process are two recursive linear-regression processes
running in parallel, incorporating acceleration and thruster signals as they become available.
Each linear regression yields a 24-parameter model containing the x, y, and yaw acceleration
associated with each of the eight thrusters.

6.4.2 Failure Detection 

The first recursive linear-regression process is used primarily to detect when a failure has
occurred for each thruster. This “Failure-Detection” process has a weighting factor that
causes it to focus on the most recent few seconds of data (the weighting parameter decays
exponentially in time - a “forgetting factor”). The time constant of the exponential decay
was chosen to allow quick response to a failure, while still allowing enough data collection to
prevent premature failure declaration.

This process, shown in Figure 6.9, is initialized with a model of the nominal thruster
configuration; however, due to the forgetting factor, the model can change quickly based
upon new data. The recursive process propagates a model (3 × 8 matrix representing the
best estimate of the acceleration resulting from each thruster) and covariance matrix (8 × 8
matrix representing the amount of information collected for each of the eight thrusters).

Every time a thruster is fired, the ID process collects more information about that
thruster, leading to a higher level of confidence in the estimate of the model parameters for
that thruster. A “confidence factor” is calculated by taking the diagonal terms from the
inverse of the covariance matrix. Due to the forgetting factor, the confidence factor does
not rise monotonically - it will fall if the thruster is not fired for some time.

The model generated is compared with an “Accepted Model.” The “Accepted Model”
is the overall best estimate of thruster characteristics, and is the one sent to the
neural-network-training process. It is set initially to correspond to the nominal thruster
configuration, but may be updated by either of the two recursive linear-regression processes.
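
As a rough illustration of the forgetting-factor regression and the confidence factor described above, the following Python sketch performs one recursive update. It is a minimal sketch under stated assumptions, not the implementation used on the robot: the name `rls_step`, the forgetting-factor value 0.9, and the choice to carry the inverse covariance (`info`) alongside the covariance rather than inverting it at every step are all assumptions made here for brevity.

```python
def rls_step(model, cov, info, u, accel, lam=0.9):
    """One forgetting-factor recursive least-squares update (illustrative only).

    model : 3 x 8 list of lists, acceleration per unit command of each thruster
    cov   : 8 x 8 covariance matrix, updated in place
    info  : 8 x 8 inverse of cov, updated in place (assumed bookkeeping shortcut)
    u     : 8 thruster commands (0.0 or 1.0) for this sample
    accel : 3 measured accelerations (x, y, yaw) for this sample
    lam   : forgetting factor, 0 < lam <= 1 (assumed value)
    Returns the per-thruster confidence factor, i.e. the diagonal of info.
    """
    n = len(u)
    cov_u = [sum(cov[i][j] * u[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(u[i] * cov_u[i] for i in range(n))
    gain = [value / denom for value in cov_u]

    # Correct each axis of the model by its prediction error.
    for axis in range(len(model)):
        predicted = sum(model[axis][j] * u[j] for j in range(n))
        error = accel[axis] - predicted
        for j in range(n):
            model[axis][j] += error * gain[j]

    # Discount old information and add the information in this sample.
    for i in range(n):
        for j in range(n):
            cov[i][j] = (cov[i][j] - gain[i] * cov_u[j]) / lam
            info[i][j] = lam * info[i][j] + u[i] * u[j]

    return [info[i][i] for i in range(n)]
```

With on-off commands, a thruster that fires on every sample drives its confidence entry toward 1 / (1 - lam); once it stops firing, the entry decays by a factor of lam per sample, which is the non-monotonic behaviour noted above.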

If an error in the model is detected (i.e. difference between identified and accepted
models exceeds a certain threshold) for one or more thrusters, and the confidence level for
the thruster(s) is high enough, a suspected thruster failure is declared. This decision process
is shown as the LOGIC ELEMENT in Figure 6.9. When this condition is met, three things
happen:

1. The suspected thruster is added to the “List of Suspects.”

2. A reset signal is sent to the “Model-Building” Recursive Linear-Regression process.
For the newly-suspected thruster(s) only, all prior information is to be erased. This
is achieved by inverting the covariance matrix, zeroing the row and column
corresponding to each newly-suspected thruster, inverting this matrix (setting the diagonal
element to a small number so the inversion is possible), and setting the covariance
matrix equal to this quantity. This has the effect of eliminating any prior information
concerning the newly-suspected thruster, while leaving the rest of the model intact.

3. The identified model (for the newly-suspected thruster(s) only) is copied to the
“Accepted Model.” This is shown by the closing of the switch in Figure 6.9. This new
“Accepted Model” is then sent immediately to the neural-network training process.
There, a linear approximate solution is calculated immediately⁴, infused into an FCA
network and copied to the robot. The result is a near-instantaneous stabilization of
the robot, once the thruster failure has been detected.
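
The decision rule and the covariance reset in item 2 can be sketched in Python as follows. This is a hedged illustration, not the thesis code: the function names, the 0-based thruster indices, the use of the largest per-axis difference as the model error, and the numerical thresholds and `eps` value are all assumptions.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def newly_suspected(identified, accepted, confidence, suspects,
                    error_threshold=0.2, confidence_threshold=10.0):
    """Return thrusters whose identified model deviates from the accepted one.

    identified, accepted : per-thruster (x, y, yaw) acceleration tuples
    confidence           : per-thruster confidence factors
    suspects             : set of thrusters already on the list (skipped)
    Both thresholds are assumed values.
    """
    found = []
    for k, (new, old) in enumerate(zip(identified, accepted)):
        if k in suspects:
            continue
        error = max(abs(a - b) for a, b in zip(new, old))
        if error > error_threshold and confidence[k] >= confidence_threshold:
            found.append(k)
    return found


def reset_thrusters(cov, thrusters, eps=1e-6):
    """Erase prior information about the given thrusters from a covariance matrix."""
    info = invert(cov)                 # covariance -> information
    n = len(info)
    for k in thrusters:
        for j in range(n):
            info[k][j] = 0.0           # zero the row ...
            info[j][k] = 0.0           # ... and the column
        info[k][k] = eps               # small diagonal keeps it invertible
    return invert(info)                # back to covariance form
```

After the reset, the covariance entry for a suspected thruster is about 1 / eps, so the next few samples dominate its estimate, while the information held about the other thrusters is unchanged.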

Once a thruster is suspected, it will not be labelled as a suspect again until the
adaptation process is reset. It will remain on the list of suspects until it is removed by the
“Model-Building” Recursive Linear-Regression process.

⁴The a priori linear relation used here is found by assuming that the thrusters are capable of
continuous-valued thrust output (a linearized version of this problem). The solution is an 8 × 3 pseudo-inverse of the
3 × 8 matrix which maps thrusters to base forces, F. Some simple adjustments are then made to account
for the one-sided aspect of the thrusters (i.e. they cannot produce negative thrust).

6.4.3 System Identification

The second recursive linear-regression process is used to build the model of the thrusters
that is used for neural-network training. This “Model-Building” process does not have a
forgetting factor - it incorporates all of the information equally, so the result is exactly the
same as if a single batch least-squares identification were run using all of the data.
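
The equivalence between the recursive and batch forms can be checked on a one-parameter problem. The Python sketch below is illustrative only: it estimates a single thruster's acceleration `theta` from (command, acceleration) samples, and the names `recursive_fit`, `batch_fit`, `theta0`, and `p0` are assumptions. The initial estimate and its variance enter the batch solution as one extra pseudo-measurement.

```python
def recursive_fit(samples, theta0, p0):
    """Recursive least squares without forgetting, for accel = theta * command."""
    theta, p = theta0, p0
    for u, a in samples:
        gain = p * u / (1.0 + u * p * u)
        theta += gain * (a - u * theta)   # correct by the prediction error
        p -= gain * u * p                 # variance can only shrink
    return theta, p


def batch_fit(samples, theta0, p0):
    """Single batch least-squares solve using the same prior and all samples."""
    sum_uu = 1.0 / p0 + sum(u * u for u, _ in samples)
    sum_ua = theta0 / p0 + sum(u * a for u, a in samples)
    return sum_ua / sum_uu, 1.0 / sum_uu
```

A large `p0` (low initial confidence) lets the data set the estimate almost entirely, whereas a small `p0` (high initial confidence) keeps the estimate close to `theta0` until a comparable amount of data has been collected.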

This model is meant to be stable, basically changing only after a suspected thruster has 
been flagged. For this reason, it is initialized to the nominal model with a high level of 
confidence, and therefore does not vary significantly with random fluctuations in the data.
However, when a thruster is flagged as being suspected by the “Failure-Detection” process,
all information about that thruster is eliminated, as described above. Information about 
the other thrusters remains unchanged. New information about the suspected thruster is 
then incorporated into the ID process, and it reacts quickly to the new situation due to the 
elimination of old information. 

The model and covariance matrices are updated recursively as new data comes in, as
with the “Failure-Detection” Linear Regression. Since there is no forgetting factor, the
confidence factor rises monotonically. When certain levels of confidence are reached and
error criteria are met, the “Accepted Model” is updated. When confidence reaches a high
level, the thruster(s) in question will be removed from the “List of Suspects.”

During the time between first suspicion and final confirmation, the thruster in question
is excited artificially, as described below.


6.4.4 Artificial Excitation 

When a thruster is suspected of having failed, an artificial-excitation method will cause that
thruster to fire more than it normally would, allowing for more information to be collected,
and ultimately, expediting identification. The excitation is achieved with two basic methods:
(1) when position-control errors are “small,” a thruster may be fired open-loop for a brief
period of time (until the thruster characteristics are identified or the errors are no longer
“small”); (2) when position-control errors are “medium,” the thrusters that are targeted for
excitation are used exclusively for closed-loop control. When position-control errors
become “large,” artificial excitation is suspended until the errors are reduced.

The excitation is controlled by two signals sent from the identification process to the
robot:

1. A list containing which, if any, of the eight thrusters should be subjected to artificial
excitation: T_excite.

2. A TRUE/FALSE command indicating whether the robot should limit itself to firing 
one thruster at a time: FireOneOnly. 

List of Suspects 

The “List of Suspects” component in Figure 6.9 keeps track of the suspected thrusters.
Thrusters are added to the list by the “Failure-Detection” component, and then removed by
the “Model-Building” component once their new characteristics have been confirmed (and
possibly a confirmation of no change, if the initial failure-detection signal was erroneous).

FireOneOnly 

If the “List of Suspects” contains any thrusters, FireOneOnly is set to be TRUE. Firing 
of multiple thrusters complicates the identification process, and identification accuracy will 
be improved if firing is limited to one thruster at a time. However, keeping the tracking
error low is a priority, and may override this limitation. The flow of signals is summarized
in Figure 6.2. 

List of Thrusters to Excite 

When suspected thrusters exist, they are copied directly to the List of Thrusters to Excite,
and are sent to the robot as T_excite, shown in Figures 6.2 and 6.9. When all suspected
thrusters have been cleared by the “Model-Building” process, any thrusters that have not
yet been identified to a high level of confidence are added to T_excite. The logic behind this
is that if some failures have been detected already, then whatever caused them (such as a
plumbing failure, micro-meteorite impact, or intentional damage imparted by a graduate
student) may have caused other as-yet-undetected failures, and identifying them quickly is
important.
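
The rules for the two excitation signals can be summarized in a few lines of Python. This is a sketch under assumptions: the function name, the `failure_detected` flag, the dictionary of per-thruster confidence factors, and the `high_confidence` value are hypothetical, and FireOneOnly is assumed to be FALSE whenever the “List of Suspects” is empty.

```python
def excitation_signals(suspects, confidence, failure_detected,
                       high_confidence=30.0):
    """Return (t_excite, fire_one_only) from the current identification state.

    suspects         : set of currently suspected thruster numbers
    confidence       : dict mapping thruster number -> confidence factor
    failure_detected : True once any failure has been detected (assumed flag)
    high_confidence  : assumed threshold for "identified to a high level"
    """
    if suspects:
        # Suspected thrusters are copied directly to the excitation list.
        t_excite = sorted(suspects)
    elif failure_detected:
        # Suspects cleared: excite whatever is not yet confidently identified.
        t_excite = sorted(t for t, c in confidence.items()
                          if c < high_confidence)
    else:
        t_excite = []
    return t_excite, bool(suspects)
```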

Thruster excitation will be attempted as long as at least one thruster remains in T_excite.
If the robot position error is “large,” no excitation will be used - the robot is most concerned
with maintaining control. If the position error is “medium,” and FireOneOnly is set to be
TRUE, the robot will fire exactly one thruster. The thruster is chosen by finding the thruster
from those in T_excite whose currently-estimated characteristics best match the desired force
vector. In this middle region, artificial excitation takes place, but also serves to control the
robot. If the position error is “small,” the robot will fire exactly one thruster. The thruster
is chosen by finding the thruster from T_excite whose currently-estimated characteristics most
differ from the nominal. Some hysteresis is added to prevent chatter across small/medium
and medium/large boundaries.
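
The selection logic above can be sketched in Python as follows, assuming FireOneOnly is TRUE. The error thresholds, the hysteresis margin, the use of cosine similarity for “best match,” and the use of Euclidean distance for “most differ” are all assumptions chosen for illustration; the text does not specify these measures.

```python
import math


def error_band(error, previous, small=0.05, large=0.20, margin=0.01):
    """Classify the position error with hysteresis around both boundaries."""
    small_edge = small + margin if previous == "small" else small - margin
    large_edge = large - margin if previous == "large" else large + margin
    if error < small_edge:
        return "small"
    if error > large_edge:
        return "large"
    return "medium"


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _cosine(a, b):
    denom = _norm(a) * _norm(b)
    if denom == 0.0:
        return -1.0                      # a dead thruster matches nothing
    return sum(x * y for x, y in zip(a, b)) / denom


def pick_thruster(band, t_excite, model, nominal, desired):
    """Choose the single thruster to fire, or None for normal control.

    model, nominal : dict thruster -> (x, y, yaw) estimated / nominal effect
    desired        : desired (x, y, yaw) force vector
    """
    if band == "large" or not t_excite:
        return None                      # excitation suspended
    if band == "medium":
        # Closed-loop: the excitable thruster best aligned with the request.
        return max(t_excite, key=lambda t: _cosine(model[t], desired))
    # Small error, open-loop: the thruster that looks most changed.
    return max(t_excite,
               key=lambda t: _norm([m - n for m, n in zip(model[t], nominal[t])]))
```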

This artificial excitation method leads to quick identification. For the case presented
here, with 6 of 8 thrusters failed, the ID process consistently takes less than 60 seconds 
from when the first thruster is fired until the last thruster is identified to a high level of 
confidence. 

A reconfiguration example, including error and thruster-firing plots, is presented at the 
end of this chapter. The effects of the ID process and neural-network training will be
presented there. 


6.5 Neural-Network Training

The system identification process can be completed in less than 60 seconds, due to the artificial
excitation. However, the control system requirements do not allow the system to remain
unstable for that length of time (the size of the granite table is the limiting factor). Use
of linear approximate solutions, implemented via the FCA, provides stability, but with a low
level of performance. Running the neural-network training process in parallel with the
ID process results in higher-performance control, as the nonlinear capabilities of the neural
network optimize beyond the starting point of the linear approximation. The neural-network
training process is shown in Figures 5.6 and 6.4.

The neural-network training is not activated until the first thruster failure is detected.
From this point on, it is running continuously, using the most-recent thruster model provided
by the ID process. When a significant change is detected, such as the total loss of a
thruster, a linear solution is calculated, and the network starts from scratch, with the linear
solution input via the FCA. When small changes are detected, such as the convergence of
an identification on the new final value, learning is continued with the updated model.

The performance of the neural-network thruster mapper is evaluated periodically on
a “test set” of thruster-mapping input-output patterns. If the performance (a weighted
combination of force matching and gas conservation) is better than the test-set performance
of the thruster mapper currently on the robot, it will be copied to the robot. The copying is
performed by sending the FCA weight matrix (as in Figure 1.2) over the wireless Ethernet.
The neural-network function running on the robot swaps in these new parameters, resulting
in an instantaneous change in the functionality of the thruster mapper.
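
The supervisory decisions in this section (restart or continue training, and whether to copy a new mapper to the robot) might be sketched as below. This is an assumption-laden illustration: the function names, the restart threshold, the weights, and the convention that a lower score is better are not taken from the thesis.

```python
def training_action(old_model, new_model, restart_threshold=0.25):
    """Decide how training should react to a model update.

    A large change in any model entry restarts training from the linear
    solution; a small change lets training continue with the new model.
    The threshold is an assumed value.
    """
    change = max(abs(n - o)
                 for old_row, new_row in zip(old_model, new_model)
                 for o, n in zip(old_row, new_row))
    return "restart_from_linear" if change > restart_threshold else "continue"


def mapper_score(force_error, gas_used, force_weight=1.0, gas_weight=0.1):
    """Weighted test-set cost combining force matching and gas use (lower is better)."""
    return force_weight * force_error + gas_weight * gas_used


def should_deploy(candidate_score, deployed_score):
    """Copy the candidate mapper to the robot only if it beats the deployed one."""
    return candidate_score < deployed_score
```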

6.6 Rapid Reconfiguration 

One of the major issues in neural control is speed of learning. This is important in the robot
application due to the goal of stabilizing an unstable system within a limited workspace.
Rapid reconfiguration has been achieved here, and it is due to a combination of two aspects
of the learning process: first due to the FCA, and second due to the growing of the network.

1. The FCA helps before training begins by immediately giving the network a good linear
solution that stabilizes the robot.

2. The neural network is grown during training. This refers to starting with a few hidden
neurons and gradually adding new ones as training progresses. With few hidden
neurons, very quick learning takes place, since fewer computations are required, and
fewer training patterns are required (to avoid overfitting). As more hidden neurons
are added, the learning rate slows down, but the greater functionality can be used to
further optimize performance.

The network begins with 3 inputs, 8 hidden neurons, and 8 outputs, and gradually
grows to 30 or more hidden neurons as training progresses. New hidden neurons are
added when performance begins to plateau. To prevent overfitting, the training-set
size is grown proportionally with the number of hidden neurons. With this arrangement,
a mapping with about 30% error above optimal results in 30 seconds, 20%
above optimal within 60 seconds, and 10% above optimal⁵ within 300 seconds, running
on a Sun Sparc 10 workstation. As more hidden neurons are added, the network
performance approaches optimality, but at the expense of slower training.
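
A growth schedule of this kind could be driven by a simple plateau test, as in the Python sketch below. The window length, the minimum relative improvement, the one-neuron step, and the number of training patterns per hidden neuron are all assumed values; only the idea of growing on a plateau and scaling the training set with network size comes from the text.

```python
def should_grow(test_errors, window=3, min_gain=0.01):
    """Return True when test-set error has plateaued.

    test_errors : history of periodic test-set errors, most recent last
    window      : how many evaluations back to compare against (assumed)
    min_gain    : minimum relative improvement that counts as progress (assumed)
    """
    if len(test_errors) <= window:
        return False
    old, new = test_errors[-window - 1], test_errors[-1]
    return (old - new) < min_gain * old


def grow(hidden, patterns_per_hidden=25):
    """Add one hidden neuron and scale the training-set size proportionally."""
    hidden += 1
    return hidden, hidden * patterns_per_hidden
```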


⁵Due to the use of discrete-valued actuators, there is almost always a force error. The error value
reported here indicates the average magnitude of the force-error vector relative to the magnitude
achievable with an optimal mapper.

Figure 6.10: Experimental Results of Reconfiguration

(x, y, yaw) position errors (desired - actual) are plotted during autonomous reconfiguration of the
control system in response to the six severe thruster failures shown in Figure 6.8. Static control
deadband is approximately ± 3 cm in translation and ± 11° in rotation. The robot begins at
rest within the deadband, is disturbed at t = 0, stabilizes itself within 4 seconds, and completes
identification (aided by artificial excitation) after 48 seconds. The neural-network thruster mapper
continues to optimize after the identification is complete. Thruster signals are shown in lower plot.
Black rectangular regions indicate periods of thruster firing. Darkly-shaded regions indicate the
time during which the thruster was suspected. In addition to artificial excitation of the suspected
thrusters, excitation of unsuspected thrusters is used to expedite the identification process. These
periods are marked by the lightly-shaded regions.




6.7 Experimental Results of Reconfiguration 

The previous sections have provided a description of the structure of the control system
used for reconfiguration, as well as detailed descriptions of several of the key components.

In this section, experimental data from a typical reconfiguration to recover from multiple
destabilizing thruster failures is presented. Figure 6.10 plots the position errors (desired -
actual) and thruster firing histories during the reconfiguration.

The thrusters have been misconfigured severely, as in Figure 6.8. Before t = 0, the
complete control system is active, but no thrusters fire, since the robot is drifting within
the control deadband. With no thrusters firing, the thruster failures do not cause problems, 
but they also cannot be detected. 

At t = 0 seconds, a small disturbance is applied to the robot. One of the first thrusters
to fire (shown in Figure 6.8 and the upper right corner of Figure 6.10) is destabilizing
in yaw, causing the robot to begin spinning out of control. The error
signals in all degrees of freedom grow significantly following the disturbance, as seen in
Figure 6.10. The robot spins to its left, causing the actual yaw angle to increase and the yaw
error to decrease (yaw error = desired yaw - actual yaw). The lower half of Figure 6.10 shows
how thrusters (4), (5), and (6) stay on almost continuously (indicated by the black regions)
due to the instability.

During this time (t = 0 - 4 seconds), the “Failure-Detection” process, shown in
Figure 6.9, has been collecting data. At t = 4 seconds it declares failures in thrusters (4), (5),
and (6). This triggers a series of events, all occurring at t = 4 seconds:

1. The “Accepted Model” is updated with the new parameter estimates for thrusters
(4), (5), and (6), as identified by the “Failure-Detection” process. The
exponentially-forgetting linear regression weights recent data more heavily than old data, so the
model built between t = 0 and t = 4 may be used effectively as a crude first
approximation of the characteristics of thrusters (4), (5), and (6).

2. The new “Accepted Model” is sent to the neural-network-training process, where a
linear solution is calculated immediately and implemented on the robot in the form
of an FCA network. The model at this point is just a rough estimate, and the linear
controller is far from optimal, yet these methods combine to result in the immediate
stabilization of the robot.

3. The neural-network-training process begins now, at t = 4 seconds, using the
above-mentioned linear solution as a starting point for training a new thruster mapper to
accommodate the updated model. This process continues indefinitely: model updates
are received from the identification process and incorporated into the training; if the
change in model is small (such as the change in force estimate from 1.03 N to 0.95
N), training continues; if the change is significant (such as the initial detection of a
major failure), the training is re-started, with the linear solution as a starting point.

4. Thrusters (4), (5), and (6) are added to the “List of Suspects” (shown in Figure 6.9), as
indicated by the darkly-shaded areas for thrusters (4), (5), and (6) in Figure 6.10.

5. The T_excite vector is set to [4 5 6] and sent to the robot, along with a TRUE
FireOneOnly signal. As discussed earlier in this chapter, a TRUE FireOneOnly
signal means that the controller will fire only one thruster at a time (to obtain a
more-direct identification), unless the regulation error becomes excessive. Furthermore, it
will select thrusters to fire from only those that are listed in the T_excite vector, again,
unless the regulation error becomes excessive. This will expedite the identification of
these newly-suspected thrusters. The effect of these actions is immediately apparent
in Figure 6.10: after t = 4 seconds, only one thruster is fired at a time, and the firing
of thrusters (4), (5), and (6) is favored.

6. The “Model-Building” process, shown in Figure 6.9, is reset for thrusters (4), (5), and
(6). That is, all information about thrusters (4), (5), and (6) in this model is immediately
and completely eliminated, while the information about thrusters (1), (2), (3), (7), and (8)
remains unaltered. Since a dramatic failure has been detected for thrusters (4), (5), and
(6), these models are built from scratch, beginning at t = 4 seconds.

Each of the 6 items mentioned above occurred at t = 4 seconds. 

The cumulative effect of these events at t = 4 is immediate and dramatic. The robot is
stabilized immediately, as seen by the leveling off of position errors. This rapid stabilization
is made possible by the quick estimation of what thrusters (4), (5), and (6) are doing, and the
subsequent linear control design and implementation as an FCA neural-network thruster
mapper on the robot. The errors can be seen to increase initially due to the momentum of
the robot, and it takes a few seconds for them to turn around, due to the limitation to firing
of one thruster at a time, but the restoration of stability is clear. The initial identification
is so fast in this case that errors never grew to be “large.” If they had, the restriction to
firing one thruster at a time would have been lifted until the errors were reduced to lower
levels.

At t = 6 seconds, thruster (2) is suspected and the entire process described above is
repeated: a linear mapper is calculated and implemented via FCA; T_excite becomes [2 4 5 6];
and the “Model-Building” process is reset for thruster (2). The position-error results are
less dramatic, as the failure of thruster (2) is not destabilizing.

At t = 8 seconds, the “Model-Building” process reaches a sufficient confidence value
for its estimate of one of the suspected thrusters. It updates the “Accepted Model” and removes
that thruster from the “List of Suspects” and then from T_excite. This is indicated on the plot by the
termination of the darkly-shaded region for that thruster. It stops firing at that point, as it
is no longer subject to artificial excitation.

At t = 13 seconds, two more of the suspected thrusters are confirmed similarly, as is the
last one at t = 15 seconds. Observation of the thruster-firing histories and error plots shows what is
happening during this time: the suspected thrusters are excited artificially, one at a time,
resulting in a more-accurate identification. The regulation error is kept roughly constant
during this period - i.e. within the bounds acceptable by the artificial-excitation process.

When the last suspected thruster is confirmed at t = 15 seconds, no thrusters remain on the “List of
Suspects.” The remaining as-yet-unsuspected thrusters, [1 3 7 8], are added to the T_excite
vector. They do not fire immediately, as the error is too high, but once it is within acceptable
range (after another thruster is used to reduce the error), they fire.

One of these thrusters fires at about t = 17 and t = 19 seconds. Since no other thrusters are firing
at these times, it does not take long to identify it as a suspect, which occurs at t = 20
seconds. This thruster simulates a complete thruster failure, producing only about 1/40th
of the thrust from a nominal thruster. It stays on for several seconds, yet the error plots
are fairly straight lines during this period, indicating constant momentum and very little
thrust. The “Model-Building” process confirms this at t = 23 seconds, removing it from
the list of suspects.

With an empty list of suspects, the three remaining thrusters are labeled for artificial excitation.
One of them, the other 90° elbow (the second strongly-destabilizing failure), is fired for the
first time, causing a second loss of stability. This thruster had not been excited up until this
point for two reasons:

1. The artificial excitation algorithm restricts thruster use to those thrusters that have
already been tagged as suspects, unless regulation errors exceed a certain limit.

2. The new characteristics for this thruster match the nominal characteristics of another
thruster, except that the other thruster is more efficient in producing torque. This makes the
other thruster more likely to fire than the failed one for most (but not all) force-vector requests.


These two effects conspire to prevent the firing before the first short pulse occurs at
t = 26 seconds (when both of the above conditions allow firing for the first time), and then
for a sustained firing at t = 28 seconds (when the thruster is targeted for artificial excitation
to expedite the identification).

The instability is caught quickly, since the rest of the plant is well-characterized at this
point in the identification. When its new characteristics are confirmed at t = 36 seconds,
this represents the identification of the sixth thruster failure and the final reset of the
neural-network training process (the final major change detected in thruster characteristics).

The two remaining thrusters, the only unaltered thrusters, are confirmed to have nominal
characteristics at t = 48 seconds, marking the end of the artificially-excited identification
phase. From this point on, model updates are small, and made only when they exceed a
certain threshold, so as not to disrupt the neural-network training process. In this case, one
final minor adjustment was made at t = 95 seconds.

With the completion of identification (all thrusters identified to a high level of
confidence) at t = 48 seconds, artificial excitation ends, and the sole objective of the controller
is to regulate to the desired position. Position errors in all degrees of freedom are reduced
immediately, as seen in the top half of Figure 6.10 between t = 48 and t = 55 seconds. There
is some overshoot, peaking at t = 60 seconds. This is due primarily to the deadband
associated with the on-off thrusters⁶. Following this single major overshoot, the regulation
error is reduced to be well within the static deadband, and results in an occasional single
thruster pulse, as shown in the thruster-firing plot from t = 66 to 80 seconds.


⁶Due to the control structure, control gains, and thruster-mapping cost function selected, there is a deadband
within which the thrusters will not fire, even with an optimal thruster mapper. While the size of this
deadband is difficult to characterize due to the thruster-coupling effects, the maximum static deadband
(assuming no velocity error and error in one degree of freedom only) is approximately 2.0 cm in translation
and 10.0° in yaw angle (with the nominal thruster configuration).

6.8 Summary of Experimental Results

The performance of the neural-network-based reconfigurable control system, displayed in
Figure 6.10, was excellent, providing stabilization of the robot within four seconds, despite
the presence of six major thruster failures.

The system-level design of the control system, discussed in Chapter 3, resulted in the
selection of a linear-systems approach for identification, but a neural-network approach for
the thruster mapping. The decision concerning identification was critical to achieving the
quick failure detection and identification that resulted (initial recovery occurred after only
four seconds). The neural-network mapper provided flexibility in adapting to the changes
in thruster characteristics.

The new Fully-Connected Architecture, discussed in Chapter 4, allowed the neural
network to make immediate use of the model provided by the identification component. A
linear approximate thruster mapper was calculated immediately following the initial failure
detection at t = 4 seconds. Implementation of this linear solution with the FCA provided
immediate stabilization. This was followed by optimization of the nonlinear portion of the
neural network, resulting in near-optimal performance within 2 minutes. This performance
was obtained despite the implementation on a serial microprocessor; implementation on
parallel-processing hardware would provide dramatically faster performance.

The new learning algorithm described in Chapter 5 was used to allow gradient-based
optimization, in spite of the presence of the non-differentiable thrusters. The use of gradient
information to direct the optimization resulted in a dramatic improvement in learning rate 
over what could have been obtained with a method that was not gradient-based. 




Chapter 7 


Conclusions 


This final chapter consists of two sections. The first section summarizes the findings of this 
research. The second gives suggestions for future research. 


7.1 Summary 

This thesis has described four new developments in neural-network control that grew out
of a research program using a laboratory-based experimental prototype of a free-flying
space robot. The advances were motivated by, and developed for, a complex reconfigurable
thruster control problem applicable to real spacecraft. Focussing on a specific complex
control task was useful in identifying some of the real-world issues in neural-network control.
The work has led to the conclusions that follow.

7.1.1 System-Level Design Approach: The Superiority of Hybrid Control

One basic conclusion from this research is that a combination of the nonlinear processing
capabilities of neural networks with existing conventional control theory can be very powerful.
A careful system-level analysis and design that considers the costs and benefits of all
available tools from the fields of neural networks and control is likely to be more successful
than an approach that has already decided up front what tools will be used. An objective
evaluation of the costs and benefits of each available approach, followed by an efficient
integration of these approaches, and development of extensions to existing theory where they
are needed, constitutes a powerful strategy for solving complex nonlinear control problems
in the real world.

In this work, the costs and benefits of neural network approaches have been outlined,
and objectively compared with alternative conventional approaches. The overall success
of the reconfigurable control system resulting from application of this strategy provides
additional support to this conclusion.

To summarize the criteria for valuable applications of neural networks, it was shown
that applications should involve systems with inscrutable (if the exact form can be derived, 
that will probably be more effective than a neural approach) nonlinearities (if the system is 
purely linear, linear methods tend to have better convergence and provability characteristics 
than do neural networks) that may require some form of adaptation (neural networks excel 
here, since they are already designed for iterative training). Additionally, neural networks 
are well suited to applications that require the processing speed of a parallel computer,
since their architecture is inherently parallel. 

7.1.2 Quick Adaptation - FCA 

A major issue in neural network control, and particularly in reconfigurable or adaptive
control, is the requirement for speed of adaptation. The control application addressed
here highlights this need, since the robot suffers a destabilizing change in its actuators. The
instability required a recovery within seconds, not minutes or hours, and this was achieved.

A new fully-connected neural network architecture (FCA) was developed to address
this speed issue. It is a feedforward network that brings together for the first time many
useful architecture features developed by other researchers. It has connections beyond those
provided by a layered network, yet is trainable with backpropagation. Aided by a systematic
complexity control scheme, this network was shown to have certain advantages over layered
networks, particularly for control problems.

The most significant advantage in this application is the ability to incorporate seamlessly
a linear solution before training begins. In control, as with other fields, linear approximate
solutions are often easily calculated based upon prior knowledge of the system properties.
Quickly calculating the linear approximation, and directly inputting that solution into the
neural network, facilitates rapid adaptation without the need for time-consuming iteration.
This feature may be especially useful if it allows immediate stabilization, as it does here.
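
As an illustration of how a linear solution can be loaded before training begins, the following Python sketch builds a small network with direct input-to-output (feedthrough) connections in parallel with one hidden layer. The class name `FeedthroughNet`, the gain matrix `K`, the tanh hidden units, and the linear output are all assumptions made for this example; they are not the thesis implementation. Because the hidden-to-output weights start at zero, the network output equals the linear solution exactly until training modifies the nonlinear part.

```python
import math

class FeedthroughNet:
    """Toy network: y = W_direct x + W_out tanh(W_hid x)."""

    def __init__(self, n_in, n_hidden, n_out):
        # Small deterministic hidden weights; every other weight starts at zero.
        self.w_hid = [[0.1 * (i + j + 1) for j in range(n_in)] for i in range(n_hidden)]
        self.w_out = [[0.0] * n_hidden for _ in range(n_out)]
        self.w_direct = [[0.0] * n_in for _ in range(n_out)]

    def load_linear_solution(self, gain):
        # Copy the linear approximate solution into the feedthrough weights.
        self.w_direct = [row[:] for row in gain]

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w_hid]
        out = []
        for direct_row, out_row in zip(self.w_direct, self.w_out):
            linear = sum(w * xi for w, xi in zip(direct_row, x))
            nonlinear = sum(w * h for w, h in zip(out_row, hidden))
            out.append(linear + nonlinear)
        return out

K = [[2.0, -1.0], [0.5, 3.0]]   # hypothetical linear gains
net = FeedthroughNet(n_in=2, n_hidden=3, n_out=2)
net.load_linear_solution(K)
x = [1.0, 2.0]
y = net.forward(x)              # equals K x while w_out is all zeros
```

In a purely layered network the same linear map would have to be learned iteratively through the hidden units, which is the delay that the feedthrough connections avoid.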

Another feature of the FCA that contributes to its rapid rate of adaptation is the growing
of the network. A smaller (fewer hidden neurons) network converges more rapidly, since
there are fewer parameters to adapt, fewer calculations need to be made, and a smaller
training set is required to prevent over-fitting. The network begins with a small number of
hidden neurons, and gradually adds more, as greater functional capacity is called for.
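
The growing step can be sketched as follows. This is a minimal illustration that assumes a single-output network and assumes that a new hidden neuron is added with a zero output weight, so that the mapping is unchanged at the moment of insertion; the function names `forward` and `grow` are hypothetical, and the actual growth rule used in this research may differ.

```python
import math

def forward(x, w_direct, hidden_in, hidden_out):
    """Single-output net: direct linear term plus a sum over hidden neurons."""
    y = sum(w * xi for w, xi in zip(w_direct, x))
    for w_in, w_out in zip(hidden_in, hidden_out):
        y += w_out * math.tanh(sum(w * xi for w, xi in zip(w_in, x)))
    return y

def grow(hidden_in, hidden_out, new_input_weights):
    """Append one hidden neuron whose output weight starts at zero."""
    return hidden_in + [list(new_input_weights)], hidden_out + [0.0]

w_direct = [1.0, -0.5]
hidden_in, hidden_out = [[0.3, 0.2]], [0.7]
x = [0.4, 1.2]

before = forward(x, w_direct, hidden_in, hidden_out)
hidden_in, hidden_out = grow(hidden_in, hidden_out, [0.1, -0.1])
after = forward(x, w_direct, hidden_in, hidden_out)   # same value as before
```

Under this assumption, what was learned by the smaller network is retained, and further training only has to adapt the added capacity.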

7.1.3 Gradient-Based Optimization for DVFs - Noisy Sigmoids

A new technique was developed that extends gradient-based optimization (e.g. backpropagation
learning) for the first time to systems involving discrete-valued functions (DVFs)
(which are not continuously differentiable). This approach was motivated by the need for
adapting to the changing properties of the on-off thrusters used to control the robot. The
solution to this difficult, but specific, problem is to approximate the DVFs with noisy sigmoids.
This simple solution has been demonstrated to extend to other applications involving
optimization with DVFs. One important example is for neural networks built with hard-limiting
nonlinearities rather than sigmoid functions. These are attractive because they are
cheaper and easier to implement in hardware. Another example is design optimization for
systems with DVFs (e.g. a structural design optimization that chooses between 1/4 inch
and 3/8 inch wall thickness; 3, 4, 5, or 6 screws; and 2 or 3 beams). This second application
has not yet been demonstrated, but it is expected to work well.

The modification to backpropagation is very small, simply requiring continuous approximation
of the DVFs and injection of noise on the forward sweep; yet the improvement in
network performance is dramatic.
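
A minimal sketch of this modification is given below for a single on-off element. It assumes additive uniform noise at the sigmoid input, a squared-error cost, and a hard limiter at run time; the function names, learning rate, noise amplitude, and the four training samples are hypothetical choices for illustration only.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hard_limit(z):
    """Discrete-valued function used at run time (on = 1, off = 0)."""
    return 1 if z > 0.0 else 0

def train_noisy_sigmoid(samples, noise_amp=1.0, rate=0.5, epochs=500, seed=0):
    """Fit z = w*x + b so that hard_limit(z) matches the binary targets."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            # Forward sweep: continuous approximation with injected noise.
            z = w * x + b + rng.uniform(-noise_amp, noise_amp)
            s = sigmoid(z)
            # Backward sweep: ordinary backpropagation through the sigmoid.
            delta = (s - target) * s * (1.0 - s)
            w -= rate * delta * x
            b -= rate * delta
    return w, b

samples = [(0.0, 0), (0.2, 0), (0.8, 1), (1.0, 1)]
w, b = train_noisy_sigmoid(samples)
predictions = [hard_limit(w * x + b) for x, _ in samples]
```

The only differences from standard backpropagation in this sketch are the noise term on the forward sweep and the replacement of the sigmoid by the hard limiter once training is complete.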

It works by solving the problem unaddressed by earlier methods: roundoff error. For
gradient-based optimization to work, a gradient must exist and be non-zero during training;
so the obvious first step is to approximate the DVF with a continuous approximation
which is continuously differentiable (e.g. sigmoid-based functions). This method provides
some success, but errors result when extensive use of the transition regions occurs during
training, and roundoff to the nearest discrete level is required at run time.

Identifying this roundoff error as the problem was probably as important a step as
the solution. Identification was aided by the ability to compare the results to a known
optimal solution, as is available for the thruster-mapping problem. Without knowing the
level of performance that was possible, the problem of roundoff error might never have been
identified.

Once the problem was identified, several attempts were made to address it. The successful
method involves the simple modification of injecting noise into the sigmoid during
training. Noise creates random outputs if the transition regions are used, but has little effect
if saturated regions close to the allowed discrete levels are used. Therefore, the transition
regions are avoided during training, and roundoff error is minimal at run time.
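
The effect of the noise on the two operating regions can be checked numerically. The sketch below assumes uniform noise added to the sigmoid input; the noise amplitude and the two operating points are arbitrary illustrative values, and the helper names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def output_spread(z, noise_amp=2.0, trials=1000, seed=0):
    """Range of sigmoid outputs when uniform noise is added to the input z."""
    rng = random.Random(seed)
    outputs = [sigmoid(z + rng.uniform(-noise_amp, noise_amp)) for _ in range(trials)]
    return max(outputs) - min(outputs)

def roundoff_error(z):
    """Gap between the continuous output and the nearest discrete level."""
    s = sigmoid(z)
    return abs(s - round(s))

spread_transition = output_spread(0.0)    # operating point in the transition region
spread_saturated = output_spread(10.0)    # operating point deep in saturation
```

An operating point in the transition region produces widely scattered outputs, and therefore a large expected error during training, while a saturated operating point is almost unaffected. The same saturated point also has a negligible roundoff error, which is why a solution trained this way loses little when the sigmoid is replaced by the discrete-valued function.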

7.1.4 Experimental Demonstration of Reconfigurable Control System 

The task of rapid reconfiguration in response to destabilizing thruster failures first motivated
these developments and then drew upon them heavily in the experimental demonstration.

• The system-level control design approach resulted in a system related to a conventional
indirect adaptive control system, and used a neural network as an efficient, adaptive
method to implement the nonlinear thruster mapping component.

• The FCA resulted in near-immediate stabilization and rapid learning, due to the
feedthrough connections and growing of the network.

• The gradient-based optimization for discrete-valued functions resulted in a more accurate
mapping due to a good approximation to the on-off thrusters, while allowing
the rapid optimization made possible with use of gradient information.

When trained off-line and tested experimentally on the real robot, the neural-network
thruster mapper provided near-optimal performance during multiple-degree-of-freedom
trajectories. Arbitrary accuracy could be obtained depending upon the size of the network
used. With no thruster failures (so symmetries may be used) and 6 hidden neurons, a
thruster-mapping force error of 3.5%¹ was achieved. This small error is barely perceptible
due to the use of feedback in the control system.

When reconfiguring the control system in response to previously-unknown, major, destabilizing
thruster failures, rapid stabilization and optimization were achieved, as seen in
Figure 6.10. Detection of a destabilizing failure took from 2 to 5 seconds (the problem is
complicated by noisy accelerometers, and by the firing of multiple thrusters simultaneously).
After the initial detection, calculation of a stabilizing linear approximate solution, and
implementation via the FCA, took less than one second. As thrusters are suspected to have
changed characteristics (e.g. to be angled at 45° or 90°, have degraded thrust output, or
be plugged completely), they are artificially excited to speed up the identification. Stability
and closed-loop control are maintained during this time. With six out of the eight
thrusters failed (two were strongly destabilizing), the identification converges within about
60 seconds. The neural network thruster mapper is trained concurrently with the identification,
and the model used for training is continuously updated. Near-optimal performance
is achieved by the end of the identification phase (e.g. 20% error above optimal), and it
improves to arbitrary accuracy with further training and growing of the network.

¹ Due to the use of discrete-valued actuators, there is almost always a force error vector. The error value
reported here indicates that the average magnitude of the force error vector is 1.035 times the magnitude
achievable with the optimal thruster mapper.
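
The error figures quoted in this section compare the mapper against the optimal thruster mapper rather than against zero. One way to compute such a figure is sketched below; the function names and the sample force-error vectors are hypothetical and are chosen only so that the arithmetic is easy to follow.

```python
import math

def mean_magnitude(vectors):
    """Average Euclidean magnitude of a list of error vectors."""
    return sum(math.sqrt(sum(c * c for c in v)) for v in vectors) / len(vectors)

def error_above_optimal(mapper_errors, optimal_errors):
    """Fractional excess of the mapper's mean force-error magnitude over optimal."""
    return mean_magnitude(mapper_errors) / mean_magnitude(optimal_errors) - 1.0

optimal = [(0.3, 0.4, 0.0), (0.0, 0.6, 0.8)]      # magnitudes 0.5 and 1.0
mapper = [(0.36, 0.48, 0.0), (0.0, 0.72, 0.96)]   # magnitudes 0.6 and 1.2
excess = error_above_optimal(mapper, optimal)     # about 0.20, i.e. 20% above optimal
```

With this definition, a value of zero means the mapper matches the optimal mapper, even though the underlying force error vectors are themselves non-zero because the actuators are discrete-valued.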


7.2 Recommendations for Future Work 

Performing this research generated a number of ideas for possible future research. The 
following is a list of possible future project ideas. As this research has encompassed a broad 
range of issues, from the details of an experimental implementation to the derivation of 
a new optimization algorithm, the following suggestions have been grouped into specific
areas.


7.2.1 Integration of Neural-Network and Conventional Control 

• One of the conclusions of this research has been that the merging of neural network
technology with control systems engineering can lead to the development of highly-capable
control systems. Much neural network theory and much control theory already
exist that could produce significant advances in control capability simply through
astute integration of them. With this in mind, some possible research areas that are
related to the robot application are suggested. Control systems for physical plants
that are difficult to model, and have inscrutable nonlinearities, are good targets. These
may include high-angle-of-attack aerodynamics, or underwater robot control.

• This research has presented a reconfigurable control system implemented in real time.
Reconfigurable control is an important area of research in the military aircraft industry,
as it is desirable to have a control system that can recover from partial system
failure - e.g. battle damage, where a portion of a wing is shot off, or some control surfaces
become inoperable. Neural networks are attractive for this application due to their
ability to deal with the nonlinear aerodynamics, adaptive capability, and real-time
processing speed if implemented in hardware. The reconfiguration time requirement
for an unstable aircraft is likely to be measured in hundredths of a second, instead of
seconds, as for the space robot application. Hardware implementation of the concepts
developed here, combined with further developments tailored to the aircraft application,
could make this goal feasible. A memory-based approach may be required due
to the high speed requirement and limited data availability.

• Feedforward neural networks built with sigmoidal activation functions were used
exclusively in this research, primarily because they appear to hold much promise for
neural network control applications in general. Other neural network architectures
exist, and may prove to offer advantages depending upon the application:

1. Radial Basis Function (RBF) networks may be viable, as described in Chapter 3.

2. Sigmoidal (and RBF) networks work by attempting to form a function that "fits"
the data (training cases) they are presented. The hope is that this function forms
a generalization of the training data, and the network will perform well on new
data. However, other soundly-motivated approaches are memory-based, rather
than function-based. Rather than learn a generalizing function of the data, these
methods remember the training inputs directly, and interpolate/extrapolate as
needed when new points are input. CMAC [2] [3] is one example of a memory-based
neural network that has been used successfully in control applications [23].
Briefly, the tradeoff is that memory-based approaches learn very quickly, since
they simply remember each training input, but the recall can be much slower,
since the nearest neighbors must be found and then interpolated to produce an
output. Function-based approaches train more slowly, as they must compress
the data into the functional format created by the network topology, but have
very fast recall.
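
The tradeoff between memory-based and function-based approaches can be illustrated with a deliberately simple sketch. The memory-based model below is a plain nearest-neighbor lookup without interpolation (it is not CMAC), and the function-based model is a least-squares line standing in for a trained network; both classes and the data are hypothetical.

```python
class MemoryBasedModel:
    """Stores every training pair; recall searches for the nearest stored input."""

    def __init__(self):
        self.memory = []

    def train(self, x, y):
        self.memory.append((x, y))          # learning is a single append

    def recall(self, x):
        # Recall scans all stored points, so its cost grows with the memory size.
        nearest = min(self.memory, key=lambda pair: abs(pair[0] - x))
        return nearest[1]

class LinearFitModel:
    """Compresses the data into two parameters by least squares."""

    def __init__(self):
        self.slope, self.intercept = 0.0, 0.0

    def train(self, xs, ys):
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x

    def recall(self, x):
        return self.slope * x + self.intercept   # constant-time evaluation

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

memory_model = MemoryBasedModel()
for x, y in zip(xs, ys):
    memory_model.train(x, y)

fit_model = LinearFitModel()
fit_model.train(xs, ys)
```

For a query at x = 2.2, the memory-based model returns the stored output of its nearest neighbor, while the fitted model evaluates its compressed representation directly. The first learns in one step per sample but searches at recall; the second must process the whole training set before it can answer, but answers in constant time.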

7.2.2 Optimal (Hybrid) Combination of Neural Networks with Conventional Control: FCA

• The ability to incorporate prior knowledge has proven to be the most useful aspect
of the FCA for this application. It is currently limited to linear solutions. Extensions
to other types of solutions, perhaps closely tailored to conventional control methods,
may be useful.


• Fuzzy logic has been an area of recent interest in the control community. The
appeal of fuzzy logic is its ability to incorporate knowledge from a human expert into
a logic system. A number of rules are programmed by the expert (e.g. "if you're close
to your destination, and the brakes aren't too hot, and you're going medium speed,
and there are no immediate obstacles, then apply the brakes gently"), and the fuzzy
logic is used to blend the effects of these rules together, in a more graceful manner
than is possible with crisp logic. This ability to interface with human expertise has
proven to be useful for tasks for which such human expertise can be encoded into a
logic system. Research in incorporating this type of knowledge may be useful. Again,
an astute "hybrid" combination of fuzzy logic and conventional control may well be
superior to either alone.

• The major drawback of the FCA is the possibility of overfitting, and a complexity control
method was applied to address this issue here. Several other neural network pruning methods
exist that may deserve investigation.

• The general problem of determining the optimal topology of neural networks (includ- 
ing the number and connections of nonlinear elements) remains an important research 
issue. The technique presented here (use of the FCA to allow the implementation of 
any possible set of layers or connections, and the gradual addition of hidden neurons 
until acceptable performance is reached) provides a workable solution, but there is 
room for other advances that may improve the efficiency of the topology selection. 
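
The blending of expert rules mentioned in the fuzzy logic item of this list can be sketched as follows. The membership functions, their breakpoints, the two rules, and the output levels are all hypothetical; the obstacle condition of the braking example is omitted for brevity, the minimum is used for "and", and a weighted average is used to combine the rule outputs.

```python
def ramp_down(value, full, zero):
    """Membership that is 1 at or below `full` and falls linearly to 0 at `zero`."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)

def triangle(value, left, peak, right):
    """Triangular membership peaking at `peak`."""
    if value <= left or value >= right:
        return 0.0
    if value <= peak:
        return (value - left) / (peak - left)
    return (right - value) / (right - peak)

def brake_command(distance_m, brake_temp_c, speed_mps):
    close = ramp_down(distance_m, 10.0, 50.0)
    not_too_hot = ramp_down(brake_temp_c, 200.0, 400.0)
    medium_speed = triangle(speed_mps, 5.0, 15.0, 25.0)
    # Rule 1: close AND brakes not too hot AND medium speed -> brake gently (0.3).
    gentle = min(close, not_too_hot, medium_speed)
    # Rule 2: not close -> no braking (0.0).
    none = 1.0 - close
    total = gentle + none
    # Weighted average blends the rule outputs instead of switching between them.
    return 0.0 if total == 0.0 else (0.3 * gentle + 0.0 * none) / total
```

At a distance where the vehicle is only partly "close", the command falls between the two rule outputs, which is the graceful blending that crisp logic cannot provide.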

7.2.3 Gradient-Based Optimization for DVFs

• A significant feature of the algorithm presented here is that the specific type of noise
used (e.g. Gaussian, uniform, etc.) is not important. Furthermore, for bilevel DVFs,
the algorithm is robust to wide variations in the magnitude of the noise distribution.
Although tuning the noise level is a relatively simple operation, elimination of this
requirement is a clear advantage. Unfortunately, for DVFs with more than two levels,
tuning of the noise level is required (although selection is fairly robust and intuitive).
It has been suggested that this tuning may be avoided if a different form of the
continuous approximation function is chosen [45].

• The algorithm has been applied to two very different applications so far: optimization
of a neural network built with hard-limiters, and optimization of a neural-network
controller for a system with on-off actuators. The simplicity of the algorithm, combined
with the success on two unrelated problems, gives reason for optimism about the
applicability of the algorithm to other fields. One clear application is design optimization,
as mentioned in the research summary above.

• The application to neural networks built with hard-limiters has provided an efficient
training algorithm for a new class of neural network hardware. Study of the details of
such an implementation, or modification of the algorithm to allow for on-chip learning,
would be beneficial.

7.2.4 Thruster Mapping 

These suggestions reflect further advancements toward a better thruster control system. 
This project was chosen as a challenge problem to highlight some of the current issues in 
neural network control. However, if the goal were to make the best thruster control system 
possible, these are some issues that have not been fully addressed in this research. 

• A more complex mathematical model of the robot could be used: 

1. Include thruster transients: due to the response time of the solenoid valve, and 
additionally, the finite size of the chamber between the valve and the nozzle, the 
thrust output is time-dependent. These effects were ignored. 

2. Include low-gas-reservoir effects: the amount of gas remaining in the high and
low pressure reservoirs affects the thrust output. In these experiments, reservoir
levels were kept close to nominal so these effects were minimized (and ignored).

3. Account accurately for multiple thruster firings: due to limited flow in the plumbing,
the thrust from each thruster is reduced when multiple thrusters are fired
simultaneously (on the order of 10% loss per extra thruster). A simple linear
approximation was used for these experiments.

• Identification of thruster characteristics is performed by analyzing the direct relationship
between thruster firings and the resulting acceleration. While this is a robust,
self-contained ID scheme that meets the requirements of this application, incorporation
of position information (available from a vision system or Global Positioning
System) with a Kalman filter should improve the identification.

• The initial decision to separate the controller into control (a PD controller that ignores
discrete-actuator effects) and thruster-mapping components was made to simplify
the problem. It simplified the problem at the expense of optimality. A subsequent
step, made possible by the developments in this work, is to merge the robot-base
controller and thruster mapper design into a single component. This should result in
improved total system performance, as the neural network provides a fast method for
calculating an approximation to the optimal control solution that can be calculated
in real time. One approach could be to use the network for trajectory optimization,
accounting for the on-off actuators.
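
The simple linear approximation for multiple thruster firings, mentioned in the list of model refinements above, can be written as a one-line model. The function name and the default loss fraction are assumptions for illustration; the loss fraction in particular is only an order-of-magnitude value.

```python
def thrust_per_thruster(nominal_thrust, n_firing, loss_per_extra=0.10):
    """Linear approximation: each additional firing thruster costs a fixed fraction of thrust."""
    if n_firing <= 0:
        return 0.0
    return nominal_thrust * (1.0 - loss_per_extra * (n_firing - 1))

single = thrust_per_thruster(1.0, 1)   # no loss with one thruster firing
triple = thrust_per_thruster(1.0, 3)   # two extra thrusters, about 20% loss on each
```

A more accurate model would replace the constant loss fraction with a measured flow characteristic of the plumbing.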
Appendix A


Thruster-Mapping Cost Function 


This Appendix presents a variation on the cost function used to define the optimal thruster 
mapping. This function places weighted costs on the force-mapping error and the amount 
of gas used. 
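
To make the idea of a weighted cost on force-mapping error and gas usage concrete, the sketch below searches all on-off patterns of eight thrusters exhaustively. The thruster geometry table, the quadratic error term, and the gas weight are hypothetical and stand in for the cost functions of Chapter 2; exhaustive search is used here only because it is easy to verify, not because it is the method used on the robot.

```python
from itertools import product

# Hypothetical unit (Fx, Fy, torque) contribution of each of eight on-off thrusters.
THRUSTERS = [
    (1, 0, 1), (1, 0, -1), (-1, 0, 1), (-1, 0, -1),
    (0, 1, 1), (0, 1, -1), (0, -1, 1), (0, -1, -1),
]

def produced_force(firing):
    """Net (Fx, Fy, torque) for a tuple of eight 0/1 firing commands."""
    return tuple(sum(on * t[k] for on, t in zip(firing, THRUSTERS)) for k in range(3))

def cost(firing, desired, gas_weight=0.1):
    """Squared force-mapping error plus a weighted count of thrusters fired."""
    actual = produced_force(firing)
    error = sum((d - a) ** 2 for d, a in zip(desired, actual))
    return error + gas_weight * sum(firing)

def optimal_mapping(desired, gas_weight=0.1):
    """Exhaustive search over all 2**8 on-off patterns."""
    return min(product((0, 1), repeat=8), key=lambda f: cost(f, desired, gas_weight))

best = optimal_mapping((2.0, 0.0, 0.0))
small = optimal_mapping((0.4, 0.0, 0.0))
```

With these assumed numbers, a request of two units of force in x is met exactly by the first two thrusters, while a small request of 0.4 units fires no thrusters at all, since any firing would increase both the error and the gas term. This second case is a simple instance of the deadband that arises from discrete actuators and the chosen cost function.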

The cost functions that were used for the neural-network developments in Chapters 4
and 5 and the experiments in Chapter 6 were both presented in Chapter 2. The complexity-control
term used to augment those cost functions was presented in Chapter 4. This Appendix
presents an alternative cost function that has merit, but was not used extensively
in this research.

In minimizing the force error (and possibly also gas usage) only, the thruster mapper
does not consider the dynamics of the plant. It assumes that the $F_{des}$ vector output by
the controller feedback law is chosen carefully enough that the mapper need only concern itself
with producing the closest matching $F_{act}$. In fact, in this application, the controller component
is a simple proportional-derivative controller (shown in Figure 2.12) that does not take into
account the thruster limitations.

The decision to separate control and mapping components was made largely for simplicity
in design. Ideally, the controller component would be aware of thruster limitations -
for example, a bang-bang controller instead of a PD controller. Implementing a bang-bang
controller in real time would probably result in the same decision that was made for the
thruster mapper here: use a neural network to implement a nonlinear approximation to
the optimal controller - one that can be computed in real time. In this case, it may be
beneficial to merge the neural-network control component with the neural-network mapping
component. The single neural network would then map the six-element state error vector,
$\mathbf{x}_{err} = (x_{err}, y_{err}, \psi_{err}, \dot{x}_{err}, \dot{y}_{err}, \dot{\psi}_{err})$, directly to the binary eight-element vector of
thrusters to fire, $T = [T_1\ T_2\ T_3\ T_4\ T_5\ T_6\ T_7\ T_8]$. This even-more-complex nonlinear control
problem is not addressed here, but a much-simpler first step is proposed.

A first step to address the presence of plant dynamics in the thruster mapper is to use
an alternative error-weighting scheme. This plan does not address the on-off nature of the
thrusters directly, but does incorporate the effect of variations in the mass properties of the
robot. For example, if the moment of inertia were relatively small compared to the mass, it
might be more important to match the desired torque than the desired translational forces.

In this plan, instead of minimizing normalized force error, the force error is considered
to be a disturbance, and the resulting normalized acceleration error vector is minimized.
The normalization factor chosen is the acceleration vector resulting at the perimeter of the
base, radius r, when a single thruster is fired. In this instance, the error becomes:

[Equation not recoverable from the scanned source. The surviving annotations describe its rotational term as the net torque error about the $\psi$ axis, $(\tau_{\psi,des} - \tau_{\psi,act})$, resulting from $T$, divided by the robot moment of inertia about the $\psi$ axis.]

Due to the dimensions and mass properties of the robot used in these experiments, this
ends up being close to the original cost function, and the actual performance improvement
in this case is minimal.
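
One plausible reading of the acceleration-based weighting is sketched below. It assumes that translational force errors are divided by the robot mass, that the torque error is divided by the moment of inertia and multiplied by the base radius to give a tangential acceleration at the perimeter, and that all terms are normalized by the linear acceleration produced by a single thruster. These choices, the function name, and the numerical values are assumptions and should not be taken as the exact expression used in this Appendix.

```python
def acceleration_error(force_error, mass, inertia, radius, thruster_force):
    """Convert a (Fx, Fy, torque) error into a normalized acceleration error."""
    dfx, dfy, dtau = force_error
    reference = thruster_force / mass           # assumed normalization: one thruster's linear acceleration
    return (
        (dfx / mass) / reference,
        (dfy / mass) / reference,
        (radius * dtau / inertia) / reference,  # tangential acceleration at the base perimeter
    )

# Hypothetical numbers: a small moment of inertia makes the torque error dominate.
weighted = acceleration_error(
    (1.0, 0.0, 0.1), mass=20.0, inertia=0.2, radius=0.25, thruster_force=1.0
)
```

In this example the torque error is ten times smaller than the translational force error in raw units, yet it produces the larger normalized acceleration error, which is the situation described above in which matching the desired torque becomes the priority.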